PREFACE


THE present volume forming the second Part of my Wave Mechanics
is devoted (as foreshadowed in the Preface to Part I) to the mathematical
development of the general ideas underlying the new mechanics,
connecting it with classical mechanics and constituting it a complete
self-supporting theory. In building up the mathematical framework of
this theory I have limited myself to what I consider its most essential
elements, leaving aside a number of questions which have a methodological
value only (such as the group theory) or which are met with
in the solution of special problems.

It is my intention to consider some of these questions later on in 
connexion with the special problems which will be discussed in Part III 
(‘Advanced Special Theory’); I have carefully avoided complicating the 
general scheme of the theory by such special questions—with a few 
exceptions inserted for illustration (the relativistic theory of the
hydrogen-like atom, for example).

To make the general scheme more comprehensible I have not spared 
space, dealing with especially important general questions (such as the 
transformation and the perturbation theory, or the relativistic theory 
of the electron) at much greater length than would be necessary from 
the point of view of an adequate presentation to a sophisticated reader. 

I must cordially thank the editors for their readiness to meet my demands
on space, which have resulted in a book larger than was originally
contemplated. I must also thank M. L. Urquhart and Miss B. Swirles 
for help in correcting the English and the proofs. 

The present book, like Part I, is complete in itself, and can be read 
without acquaintance with Part I, provided the reader is familiar with 
some elementary account of wave mechanics, and is ready to explore 
its mathematical depths to obtain a profounder insight into the theory 
and to prepare himself for applying it to various special problems. 

The earlier portions of this book were written in 1931 while I was in 
America; it was completed in Leningrad nearly two years later. Some 
of the shortcomings of the book are due to this interruption and the 
impossibility of revising it in 1933 from the very beginning. 

A list of the more important references for each section is given at the 
end of the book; it is followed by a short index which should enable the 
reader to locate easily all the more important subjects treated. 


LENINGRAD 

Nov. 1933


J. F. 




CONTENTS 


I. CLASSICAL MECHANICS AS THE LIMITING FORM OF WAVE MECHANICS

1. Motion in One Dimension; Partial Reflection and Uncertainty in the Sign of the Velocity
2. Comparison between the Schrödinger and the Classical Equation of Motion in One Dimension; Average Velocity and Current Density
3. Generalization for Non-stationary Motion in Three Dimensions; The Hamilton-Jacobi Equation
4. Comparison of the Approximate Solutions of Schrödinger’s Equation; Comparison of Classical and Wave-mechanical Average Values
5. Motion in a Limited Region; Quantum Conditions and Average Values

II. OPERATORS

6. Operational Form of Schrödinger’s Equation, and Operational Representation of Physical Quantities
7. Characteristic Functions and Values of Operators; Operational Equations; Constants of the Motion
8. Probable Values of Physical Quantities and their Change with the Time
9. The Variational Form of the Schrödinger Equation and its Application to the Perturbation Theory
10. Orthogonality and Normalization of Characteristic Functions for Discrete and Continuous Spectra


III. MATRICES

11. Matrix Representation of Physical Quantities and Matrix Form of the Equations of Motion
12. The Correspondence between Matrix and Classical Mechanics
13. Application of the Matrix Method to Oscillatory and Rotational Motion
14. Matrix Representation in the Case of a Continuous Spectrum

IV. TRANSFORMATION THEORY

15. Restricted Transformation Theory; Matrices defined from different ‘Points of View’
16. Transformation of Matrices
17. Transformation Theory of Matrices as a Generalization of Wave Mechanics; Transformation of Basic Quantities
18. Geometrical Representation of the Transformation Theory

V. PERTURBATION THEORY

19. Perturbation Theory not involving the Time (Method of Stationary States)
20. Extension of the Preceding Theory to the Case of ‘Relative Degeneracy’ and Continuous Spectra; Effect of Perturbation on Various Physical Quantities




21. Perturbation Theory involving the Time; General Processes; Theory of Transitions
22. First Approximation; Theory of Simple Transitions
23. Second Approximation; Theory of Combined Transitions
24. Theory of Transitions for an Undefined Initial State


VI. RELATIVISTIC REMODELLING AND MAGNETIC GENERALIZATION OF THE WAVE MECHANICS OF A SINGLE ELECTRON

25. Simplest Form of Relativistic Wave Mechanics
26. Magnetic Forces in the Approximate Non-Relativistic Wave Mechanics
27. Relativistic Wave Mechanics as a Formal Generalization of Maxwell’s Electromagnetic Theory of Light
28. Alternative Form of the Wave Equations; Duplicity and Quadruplicity Phenomenon
29. Pauli’s Approximate Theory in the Two-dimensional Matrix Form; the Electron’s Magnetic Moment and Angular Momentum
30. More Exact Form of the Two-dimensional Matrix Theory; the Electron’s Electric Moment
31. The Exact Four-dimensional Matrix Theory of Dirac
32. General Treatment of the Spin Effect; Angular Momentum and Magnetic Moment
33. The Motion of an Electron in a Central Field of Force; Fine Structure and Zeeman Effect
34. Negative Energy States; Positive Electrons and Neutrons
35. The Invariance of the Dirac Equation with regard to Coordinate Transformations
36. Transformation of the Dirac Equation to Curvilinear Coordinates

VII. THE PROBLEM OF MANY PARTICLES

37. General Results. Virial Theorem, Linear and Angular Momentum
38. Magnetic Forces and Spin Effects
39. Complex Particles treated as Material Points with Inner Coordinates; Theory of Incomplete Systems
40. Identical Particles (Electrons) and the Exclusion Principle


VIII. REDUCTION OF THE PROBLEM OF A SYSTEM OF IDENTICAL PARTICLES TO THAT OF A SINGLE PARTICLE

41. Perturbation Theory of a System of Spinless Electrons and the Exchange Degeneracy
42. Introduction of the Spin Coordinates and Solution of the Perturbation Problem with Antisymmetrical Wave Functions
43. The Method of the Self-consistent Field with Factorized Wave Functions
44. The Method of the Self-consistent Field with Antisymmetrical Functions and Dirac’s Density Matrix
45. Approximate Solutions (Thomas-Fermi-Dirac Equation)



IX. SECOND (INTENSITY) QUANTIZATION AND QUANTUM ELECTRODYNAMICS

46. Second Quantization with respect to Electrons
47. Intensity Quantization of Particles described in the Configuration Space by a Symmetrical Wave Function (Einstein-Bose Statistics)
48. Interaction between a ‘Doubly Quantized’ System and an Ordinary System: Application to Photons
49. Electromagnetic Waves with Quantized Amplitudes; Theory of Spontaneous Transitions and of Radiation Damping
50. Application of Quantized Electron Waves to the Emission and Scattering of Radiation
51. Connexion between Quantized Mechanical (Electron) Waves and Electromagnetic Waves
52. The Quantum Electrodynamics of Heisenberg, Pauli, and Dirac
53. Breit’s Formula. Concluding Remarks

REFERENCES

INDEX TO PART I

INDEX TO PART II




ADVANCED GENERAL THEORY 

I 


CLASSICAL MECHANICS AS THE LIMITING FORM 
OF WAVE MECHANICS 


1. Motion in One Dimension; Partial Reflection and Uncertainty 
in the Sign of the Velocity 


In the first part of this book we have given a general outline of the 
development and present state of wave mechanics, emphasizing the 
physical meaning of the new conceptions and avoiding, as far as possible,
formal questions connected with the mathematical expression of
these new conceptions. We have thus been led astray from the old 
conceptions based on classical corpuscular mechanics, deepening, as it 
were, the abyss separating the old from the new mechanics. 

A systematic study of the formal questions referred to above reveals 
the wonderful fact that in spite of the fundamental physical difference 
between the new and the old mechanics, they are extremely similar 
from the mathematical point of view, i.e. from the point of view of 
the mathematical expression of the various physical quantities and the 
mathematical equations connecting them. This formal similarity forms 
a bridge over the abyss between the old and the new mechanics, 
enabling one to consider the latter as an extension or rather a refinement
of the former and to establish a one-to-one correspondence
between the old ‘classical’ and the new ‘quantum’ conceptions, quantities,
and equations—a correspondence which often looks like an
identity. 

The existence of such a correspondence is a very instructive example 
of the fact—many times already illustrated by the development of 
physics—that a drastic revision of our physical conceptions can be 
associated with a simple improvement in the underlying mathematical 
scheme. 

We shall start by considering the simplest case of the wave-mechanical
equation, i.e. the equation describing the stationary motion of a particle
in one dimension:

$$\frac{d^2\psi}{dx^2}+\frac{8\pi^2 m}{h^2}(W-U)\psi = 0, \tag{1}$$

the potential energy U being supposed to depend on x only (and not
upon t, otherwise the total energy W would not be constant).

If U were constant, then this equation would have a solution of the
form†

$$\psi = Ae^{i\alpha x} \tag{1a}$$

representing a sine wave travelling in the direction of the positive x-axis,
$\alpha$ being the positive square root of the expression $8\pi^2 m(W-U)/h^2$
(supposed to be positive). It must be borne in mind, however, that (1a) is
only a particular solution of (1), the general solution being

$$\psi = A'e^{i\alpha x}+A''e^{-i\alpha x}, \tag{1b}$$

which represents the superposition of two sine waves of the same length 
travelling in opposite directions. The fact that (1) has two independent 
particular solutions, representing, under the condition U = const., 
waves travelling in opposite directions, and that its general solution is 
equal to the sum of these two, is a consequence of the fact that (1) is 
a linear equation of the second order. 

In the general case, either for a constant or a variable U(x), the
function $\psi$, which is a complex quantity, can be written in the form

$$\psi = Ae^{i\phi}, \tag{2}$$

where $A = |\psi|$ is its modulus and $\phi$ is its argument (both of them of
course being real). This representation of $\psi$ suggests that it may be
possible to interpret the process described by it in a way similar to
that corresponding to expression (1a), namely, as a propagation of a
wave with a (variable) amplitude A(x) in a definite direction specified
by the phase $\phi(x)$ (positive if $d\phi/dx > 0$ and negative if $d\phi/dx < 0$).

Such an interpretation is, however, in general wrong, as is clearly
shown by taking for $\psi$ the expression (1b) corresponding to U = const.
Assuming A' and A'' to be real, we get in this case

$$A\cos\phi = (A'+A'')\cos\alpha x, \qquad A\sin\phi = (A'-A'')\sin\alpha x,$$

and consequently

$$A^2 = A'^2+A''^2+2A'A''\cos 2\alpha x, \tag{2a}$$

$$\tan\phi = \frac{A'-A''}{A'+A''}\tan\alpha x. \tag{2b}$$


The functions A and $\phi$ can, of course, be interpreted as the resulting
amplitude and phase at the various points, but they will not refer to
oscillations propagated in one definite direction. It will be noticed that
A, instead of being constant, may oscillate with x twice as rapidly as
the phase of each of the two component waves, and that the resulting
phase $\phi$ may alternately increase and decrease with increase of x.
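
The formulae (2a) and (2b) are easily checked numerically. The following Python sketch compares the modulus and phase of the superposition (1b) with these two formulae; the function names and the sample values chosen for A', A'', and $\alpha$ are illustrative assumptions and are not taken from the text.

```python
import cmath
import math


def superposed_wave(x, a1, a2, alpha):
    """psi = A' e^{i alpha x} + A'' e^{-i alpha x}, expression (1b)."""
    return a1 * cmath.exp(1j * alpha * x) + a2 * cmath.exp(-1j * alpha * x)


def modulus_squared_2a(x, a1, a2, alpha):
    """A^2 according to formula (2a)."""
    return a1 ** 2 + a2 ** 2 + 2 * a1 * a2 * math.cos(2 * alpha * x)


def tan_phase_2b(x, a1, a2, alpha):
    """tan(phi) according to formula (2b)."""
    return (a1 - a2) / (a1 + a2) * math.tan(alpha * x)


# Illustrative (assumed) real amplitudes and wave parameter.
a1, a2, alpha = 1.0, 0.6, 2.0
for x in (0.1, 0.5, 1.3, 2.9):
    psi = superposed_wave(x, a1, a2, alpha)
    print(x, abs(psi) ** 2, modulus_squared_2a(x, a1, a2, alpha),
          psi.imag / psi.real, tan_phase_2b(x, a1, a2, alpha))
```

In each printed row the second and third entries agree, as do the fourth and fifth. By (2a) the squared modulus ranges between (A'−A'')² and (A'+A'')², which is the oscillation of A referred to above.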


† We shall drop in future the time factor $e^{-i2\pi Wt/h}$, the oscillatory character of $\psi$ as
a function of the time being understood.
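Substituting (2) in (1) and taking into account the relations

$$\frac{d\psi}{dx} = \frac{dA}{dx}e^{i\phi}+iA\frac{d\phi}{dx}e^{i\phi},$$

$$\frac{d^2\psi}{dx^2} = \left[\frac{d^2A}{dx^2}-A\left(\frac{d\phi}{dx}\right)^2\right]e^{i\phi}+i\left[2\frac{dA}{dx}\frac{d\phi}{dx}+A\frac{d^2\phi}{dx^2}\right]e^{i\phi},$$

we get, after cancelling the common factor $e^{i\phi}$,

$$\frac{d^2A}{dx^2}+\left[\alpha^2-\left(\frac{d\phi}{dx}\right)^2\right]A+i\left[2\frac{dA}{dx}\frac{d\phi}{dx}+A\frac{d^2\phi}{dx^2}\right] = 0.$$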

Because A, $\phi$, and the parameter

$$\alpha^2 = \frac{8\pi^2 m}{h^2}(W-U) \tag{3}$$

are all real quantities, this equation can be split up into two equations:

$$\frac{d^2A}{dx^2}+\left[\alpha^2-\left(\frac{d\phi}{dx}\right)^2\right]A = 0, \tag{3a}$$

$$2\frac{dA}{dx}\frac{d\phi}{dx}+A\frac{d^2\phi}{dx^2} = 0. \tag{3b}$$

If the latter equation be divided by $A\,d\phi/dx$, it can be immediately
integrated to give

$$2\log A+\log\frac{d\phi}{dx} = \text{const.}$$

or

$$A^2\frac{d\phi}{dx} = C \quad (= \text{const.}). \tag{4}$$

Putting $d\phi/dx = C/A^2$ in (3a), we get

$$\frac{d^2A}{dx^2}+(\alpha^2-C^2A^{-4})A = 0. \tag{4a}$$

This equation for $A = |\psi|$ is equivalent to the Schrödinger equation
(1) for $\psi$, but differs from it formally by the fact that it is not linear.

Let us assume for a moment that Schrödinger’s equation, in the case
of a variable U, admits of a particular solution of the type (1a),
i.e. a solution representing waves travelling in one definite direction,
e.g. in the positive direction. We could then obviously identify A in
(2) with the amplitude and $\phi$ with the phase of these particular waves.
According to the definition of phase, the change of phase corresponding
to an increase of x by dx (the time being fixed) would be given in this
case by

$$d\phi = \frac{2\pi}{\lambda}\,dx = \alpha\,dx$$

($\lambda$ denoting the wave-length at the point considered).
We should thus have the equation

$$\frac{d\phi}{dx} = \alpha, \tag{5}$$

which is inconsistent with (3a) unless $d^2A/dx^2 = 0$. This condition, giving
A = ax+b, is, however, in general inconsistent with the relation (4a),
i.e. $A^2\alpha = C$, unless $\alpha = C/(ax+b)^2$, which means a very special
assumption for the potential-energy function U (the preceding relation is
fulfilled in particular if U = const., a being equal to zero in this case).
We thus see that a one-sided wave propagation, corresponding to the
motion of a particle in one definite direction, is in general impossible.

From the point of view of the wave conception this result is very
easily explained. Thus every field of force, i.e. every inhomogeneity in
the potential energy U or the parameter $\alpha$, leads to a partial reflection
of a wave impinging on it. If the inhomogeneity is due to a discontinuous
jump of $\alpha$, the reflection is produced at the point (or plane) of
discontinuity. If $\alpha$ varies continuously, the reflection is produced
gradually (the reflected waves giving rise to reflected waves of the
second order travelling in the initial direction, and so on).

From the corpuscular point of view this means that a particle moving
along the axis of x in a field of force parallel to x may have its velocity
reversed at every instant, so that while the magnitude of the velocity is
a given function of x, its direction or sign remains uncertain.

This uncertainty constitutes the fundamental difference between the 
new and the old mechanics. In the old mechanics, if the direction of 
the velocity is fixed at some initial instant, then it should remain the 
same so long as the kinetic energy W − U remains positive ($\alpha^2 > 0$).
Such a determinateness does not actually exist in the phenomena of 
motion. When these phenomena are described by wave mechanics, we 
find Nature in a position very similar to that of a theoretical physicist 
who, in performing complicated (and even simple!) calculations, often
feels a strong uncertainty about the sign (+ or −) which must be
assigned to the quantities under consideration. 

This uncertainty of sign or of direction of velocity for a given magnitude
of the latter and a given position can be regarded as an ‘uncertainty
principle’ characteristic of wave mechanics and not related directly to
the uncertainty principle of Heisenberg. The difference between them is
that in the latter the localization of the particle is imagined to be effected
by means of a ‘wave packet’ involving an uncertainty not so much
in the direction of the velocity as in its magnitude, whereas in the
present case there is no need for constructing such a packet, the fact
asserted being not a definite position of the particle, but the connexion
between position, which may be arbitrary (that is, specifiable in terms
of probability only) and the magnitude of the velocity. As we have
just seen, the uncertainty in the direction of this velocity is connected 
with the possibility of both transmission and reflection of the particle 
in every region where it is acted on by some force. At the very beginning 
of this book we came upon this possibility when attempting to interpret, 
from the corpuscular point of view, the phenomena of partial reflection 
and partial transmission of light at the boundary between two homogeneous
bodies. Later we studied it in more detail when investigating
the motion of material particles in a field of force according to wave 
mechanics. We can sum up the results arrived at by saying that the 
indeterminateness which constitutes the characteristic distinction between
wave mechanics and classical mechanics is due primarily to this
ambiguity in the result produced by a force acting on the particle. 
Whereas in classical mechanics such a force must either accelerate or 
retard the particle, reversing the direction of its motion only when the 
increase of potential energy would exceed the total energy, in wave 
mechanics a force can reverse the direction of motion, leaving the 
magnitude of the velocity unchanged, even when this force is acting 
in the direction of the motion, i.e. even when, according to classical 
mechanics, the particle should be accelerated without change of 
direction. 

So far as the relation between the wave-mechanical and the classical 
equations of motion is concerned, this uncertainty in the direction or 
in the ‘sign’ of the velocity, when its magnitude and the position of 
the particle are simultaneously fixed, is much more useful than 
Heisenberg’s uncertainty principle (which is another aspect of the 
fundamental ambiguity inherent in wave mechanics). It leads us to 
expect that the results predicted by wave mechanics will approach those 
predicted by classical mechanics as the reflection coefficient tends to zero,
i.e. when the ambiguity due to the possibility of reflection as well as 
transmission vanishes. In this case, transmission, i.e. motion in the 
same direction, is the only issue that comes into consideration. 

It is easy to see that a decrease in the reflection coefficient is brought 
about by a decrease in the wave-length. When the wave-length becomes 
very small compared with the length over which the potential energy 
changes by an appreciable amount, the reflection produced by this 
change of potential energy also becomes very small and vanishes in the
limiting case $\lambda = 0$.
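
The dependence of the reflection on the ratio of the wave-length to the width of the transition region can be illustrated by a rough numerical experiment. The sketch below is only a model under stated assumptions: the potential energy rises linearly from 0 to a value `du` over a distance `width`; this ramp is replaced by a large number of thin slabs of constant potential energy; the elementary reflections at the slab boundaries are combined by the usual recursion for stratified media; and scaled units are used in which $\alpha^2 = W-U$. None of these choices, nor the function names, come from the text.

```python
import cmath
import math


def ramp_reflection(w, du, width, n_slabs=2000):
    """Reflection coefficient for a linear ramp 0 -> du of given width.

    Scaled units (an assumption): alpha = sqrt(w - u).
    The ramp is approximated by n_slabs constant-potential slabs.
    """
    if w <= du:
        raise ValueError("the kinetic energy must remain positive")
    # Region 0 (u = 0), the slabs 1..n, and the final region (u = du).
    us = [0.0] + [du * (j + 0.5) / n_slabs for j in range(n_slabs)] + [du]
    alphas = [math.sqrt(w - u) for u in us]
    d = width / n_slabs

    def step(a, b):
        # Amplitude reflected at a single jump of alpha from a to b.
        return (a - b) / (a + b)

    # Start at the last boundary and work back towards the incident side.
    r = step(alphas[-2], alphas[-1])
    for j in range(len(alphas) - 3, -1, -1):
        f = step(alphas[j], alphas[j + 1])
        phase = cmath.exp(2j * alphas[j + 1] * d)  # double passage of slab j+1
        r = (f + r * phase) / (1 + f * r * phase)
    return abs(r) ** 2


w, du = 4.0, 2.0                      # illustrative (assumed) values
wavelength = 2 * math.pi / math.sqrt(w)
for width in (0.0, 1.0, 5.0, 20.0, 80.0):
    print(width / wavelength, ramp_reflection(w, du, width))
```

For zero width the result coincides with the value for an abrupt jump, and it falls off rapidly once the width amounts to several wave-lengths, which is the behaviour asserted above.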

This result can be illustrated by the fact, pointed out in Part I, § 12,
that cathode rays pass without appreciable reflection through an
electric condenser whose thickness is very large compared with the
wave-length, while they are appreciably reflected if this thickness is
reduced to zero, the potential energy change remaining the same. In
the latter case the reflection and transmission coefficients are given
by the well-known formulae

$$R = \left(\frac{\alpha'-\alpha''}{\alpha'+\alpha''}\right)^2, \qquad D = 1-R = \frac{4\alpha'\alpha''}{(\alpha'+\alpha'')^2},$$

where $\alpha'$ and $\alpha''$ are the values of the parameter $\alpha$ on both sides of the
discontinuity. It may be recalled that this parameter is proportional
to the momentum g = mv, i.e. to the velocity of the electron. When
the velocity of the impinging electrons, that is $\alpha'$, increases, the jump
$\Delta U$ of the potential energy remaining constant, $\alpha''$ also increases, while
the difference $\alpha'-\alpha''$ decreases. We have in fact, according to (3),

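$$\Delta U = U''-U' = \frac{h^2}{8\pi^2 m}(\alpha'^2-\alpha''^2),$$

whence

$$\alpha'-\alpha'' = \frac{8\pi^2 m}{h^2}\,\frac{\Delta U}{\alpha'+\alpha''},$$

or approximately

$$\frac{\alpha'-\alpha''}{\alpha'+\alpha''} = \frac{8\pi^2 m}{h^2}\,\frac{\Delta U}{4\alpha^2} = \frac{\Delta U}{4(W-U)},$$

that is,

$$R = \frac{1}{16}\left(\frac{\Delta U}{W-U}\right)^2. \tag{5a}$$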

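Here W − U is the average kinetic energy $\tfrac{1}{2}mv^2$ of the electron on both
sides of the discontinuity, while $\Delta U$ is equal to the change of this
kinetic energy, i.e. approximately $mv\,\Delta v$.

We thus get

$$R = \frac{1}{4}\left(\frac{\Delta v}{v}\right)^2 = \frac{1}{4}\left(\frac{\Delta\lambda}{\lambda}\right)^2, \tag{5b}$$

where $\lambda = h/mv$.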

Formula (5a) shows that the reflection coefficient tends to zero when
the velocity of the electron is increased, i.e. when the wave-length $\lambda$
tends to zero, the jump of potential energy $\Delta U$ remaining constant
($\Delta\lambda$ is an infinitely small quantity of a higher order than $\lambda$ itself).
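
The quality of the approximation (5a) can be judged from a short calculation. In the following sketch the function names and the sample energies are illustrative assumptions. Since the parameter $\alpha$ is proportional to $\sqrt{W-U}$ and only ratios of $\alpha'$ and $\alpha''$ enter, the constant factor $8\pi^2 m/h^2$ is omitted.

```python
import math


def reflection_exact(w, u1, u2):
    """R = ((a' - a'') / (a' + a''))^2 with a proportional to sqrt(W - U)."""
    a1 = math.sqrt(w - u1)
    a2 = math.sqrt(w - u2)
    return ((a1 - a2) / (a1 + a2)) ** 2


def transmission_exact(w, u1, u2):
    """D = 4 a' a'' / (a' + a'')^2."""
    a1 = math.sqrt(w - u1)
    a2 = math.sqrt(w - u2)
    return 4 * a1 * a2 / (a1 + a2) ** 2


def reflection_approx_5a(w, u1, u2):
    """Formula (5a), with W - U taken as the mean kinetic energy."""
    mean_kinetic = w - 0.5 * (u1 + u2)
    return (u2 - u1) ** 2 / (16 * mean_kinetic ** 2)


# Fixed jump of potential energy, increasing total energy (assumed values).
for w in (2.0, 10.0, 100.0, 1000.0):
    print(w, reflection_exact(w, 0.0, 1.0), reflection_approx_5a(w, 0.0, 1.0))
```

The two columns approach one another, and both tend to zero, as the kinetic energy becomes large compared with the jump of potential energy.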

This result holds, of course, not only for electrons but also for any
other particles: their behaviour conforms more and more to the fundamental
principle of classical mechanics, the principle of determinism
which can be stated in the form

R = 0, D = 1

as their velocity increases.

It should be noted that, for a given value of $\Delta U$, the magnitude of
the velocity for which R becomes inappreciable is the smaller the larger
the mass m, since, according to (5a), it is not the velocity itself but the
kinetic energy $\tfrac{1}{2}mv^2$ whose ratio to $\Delta U$ determines R.

2. Comparison between the Schrödinger and the Classical
Equation of Motion in One Dimension; Average Velocity and 
Current Density 

Discontinuities in the potential-energy function U(x) do not, of course, 
occur in Nature. When U(x) is a continuous function of x , i.e. when 
the force has a finite value, it is possible to give another important and 
interesting formulation of the condition under which the fundamental 
ambiguity of wave mechanics disappears (i.e. the reflection coefficient 
vanishes), the wave mechanics thus reducing to classical mechanics. 
According to de Broglie’s relation $\lambda = h/mv$, the wave-length of the
waves associated with the motion of a particle is, other things being
equal, the smaller, the smaller the value of the constant h. In reality,
of course, the latter cannot be changed. If, however, it were not a
universal constant, but could have any value whatsoever, then it would
be possible to say that wave mechanics would reduce to classical
mechanics in the limiting case h = 0; for this would mean that the
wave-length would vanish for all values of the velocity. Consequently 
the relative change of the potential energy in a distance of the order 
of magnitude of the wave-length would also vanish, and with it the 
partial reflection which is the fundamental cause of the ambiguity 
characteristic of wave mechanics. 

This result can be proved in a general way as follows: 

Let us put $\alpha = 2\pi g/h$ in equation (3a), where g (= mv) is the
magnitude of the momentum of the particle, and also

$$\phi = \frac{2\pi}{h}s. \tag{6}$$

Multiplying (3a) by $(h/2\pi)^2$, we get

$$\left(\frac{h}{2\pi}\right)^2\frac{d^2A}{dx^2}+\left[g^2-\left(\frac{ds}{dx}\right)^2\right]A = 0, \tag{6a}$$

where

$$g^2 = 2m(W-U). \tag{6b}$$
(6 b) 
It follows from this equation that in the limiting case h = 0 the function
s remains finite and is determined by the differential equation

$$\left(\frac{ds}{dx}\right)^2 = 2m(W-U). \tag{7}$$

The momentum g can be determined by this function unambiguously,
i.e. both with respect to magnitude and sign, by the equation

$$g = \frac{ds}{dx},$$

which is equivalent to equation (5), corresponding to the one-sided wave
propagation, i.e. to the motion of a particle in a definite direction.
This direction remains arbitrary, since (7) has two solutions, namely
$ds/dx = +\sqrt{2m(W-U)}$ and $ds/dx = -\sqrt{2m(W-U)}$. But once it is
chosen for some initial instant it will remain constant so long as s is a
continuous function of x without maxima or minima, where, of course,
g will change its sign after passing through the value g = 0. This
change of sign through a continuous variation corresponds to total
reflection and has nothing to do with the discontinuous reversal of the
sign of g which is allowed by the exact theory embodied in the wave
equation (1) (with h > 0) and which corresponds to partial reflection.
The difference between the exact equation (1) and the approximate
equation (7), so far as the ambiguity in the sign, i.e. in the direction
of the velocity, is concerned, consists in the fact that the former, being
a linear equation of the second order, admits both signs simultaneously
(superposition of waves travelling in opposite directions), while the
latter, being a quadratic equation of the first order, admits either one
sign or the other. It should be remembered that the exact equation
which is satisfied by the function s is much more complicated than (7).
This exact equation can be obtained by eliminating A from equations
(3a) and (3b) with $\phi = 2\pi s/h$.

It is often convenient to use, instead of the function s defined in this
way, another function S defined by the equation

$$\psi = e^{i2\pi S/h} \tag{8}$$

or

$$S = \frac{h}{2\pi i}\log\psi. \tag{8a}$$

This S is connected with s (i.e. the ‘phase’ $\phi$) and the ‘amplitude’ A by
the relation

$$S = s+\frac{h}{2\pi i}\log A.$$
It is a complex quantity which represents both $\phi$ and A and is equivalent
to $\psi$.

Substituting the expression (8) in Schrödinger’s equation (1) and
using the relations

$$\frac{d\psi}{dx} = i\frac{2\pi}{h}\frac{dS}{dx}e^{i2\pi S/h}, \qquad \frac{d^2\psi}{dx^2} = \left[i\frac{2\pi}{h}\frac{d^2S}{dx^2}-\frac{4\pi^2}{h^2}\left(\frac{dS}{dx}\right)^2\right]e^{i2\pi S/h},$$

we get

$$\frac{h}{2\pi i}\frac{d^2S}{dx^2}+\left(\frac{dS}{dx}\right)^2 = 2m(W-U). \tag{8b}$$

If we put here h = 0, this equation reduces to (7), so that when h = 0
the two functions s and S become identical. We must now investigate
the meaning of the approximate equation (7) which they both satisfy
in this limiting case.

In a certain sense it merely expresses the law of the conservation of
energy—since ds/dx is, by definition, the momentum g of the particle
and $\frac{1}{2m}\left(\frac{ds}{dx}\right)^2$ is its kinetic energy.

The equation is unusual, however, in that the momentum of the 
particle, and consequently its velocity, is determined as a function of 
the coordinate x, whereas in the classical description of motion the 
velocity, as well as the coordinate itself, usually appear as functions of 
the time t. Such a description of motion is impossible in wave mechanics 
because of the uncertainty in the direction of the velocity. If it is 
true, however, that in the case h = 0 the wave-mechanical equation 
of motion (8 b) must reduce to the classical equation, then equation 
(7) must be equivalent to Newton’s equation of motion 


$$m\frac{d^2x}{dt^2} = -\frac{dU}{dx}, \tag{9}$$


defining x and v = dx/dt as functions of the time. This equivalence is 
readily recognized as soon as we realize what is meant by defining the 
velocity (or momentum) of a particle as a function of its coordinate. 
Let us suppose that equation (9) has been integrated, and that x and 
v have been determined as functions of the time t. Then, eliminating 
the time t between them, we can express one of them, e.g. v, as a function
v(x) of the other. The acceleration $d^2x/dt^2$ can then be calculated
by means of the formula

$$\frac{d^2x}{dt^2} = \frac{dv}{dt} = \frac{dv}{dx}\frac{dx}{dt} = \frac{dv}{dx}v = \frac{d}{dx}\left(\frac{v^2}{2}\right),$$
so that equation (9) can be written in the form

$$\frac{d}{dx}\left(\frac{mv^2}{2}\right) = -\frac{dU}{dx}$$

or

$$\frac{mv^2}{2}+U = \text{const.}$$

If mv = g is replaced by ds/dx and the constant is denoted by W, we
get equation (7).

We thus see that this equation expresses not only the law of conservation
of energy, but at the same time the classical law of motion.
It should be mentioned that both laws are equivalent to one another
only in the special case which we are considering here of motion in one
dimension (see below). 
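
The passage from Newton’s equation (9) to the energy relation can be illustrated numerically. The following sketch integrates (9) step by step and records $\tfrac{1}{2}mv^2+U$ along the path. The harmonic potential energy $U = \tfrac{1}{2}kx^2$, the velocity-Verlet integration scheme, and all names and numerical values are assumptions chosen for illustration only.

```python
def energy_along_path(m, k, x0, v0, dt, steps):
    """Integrate m x'' = -dU/dx for U = k x^2 / 2 (velocity-Verlet scheme)
    and return the total energy m v^2 / 2 + U after every step."""
    def force(x):
        return -k * x  # -dU/dx for the assumed harmonic potential energy

    x, v = x0, v0
    energies = []
    for _ in range(steps):
        a_old = force(x) / m
        x = x + v * dt + 0.5 * a_old * dt * dt
        a_new = force(x) / m
        v = v + 0.5 * (a_old + a_new) * dt
        energies.append(0.5 * m * v * v + 0.5 * k * x * x)
    return energies


# Illustrative (assumed) values: unit mass, unit stiffness, released from rest.
energies = energy_along_path(m=1.0, k=1.0, x0=1.0, v0=0.0, dt=0.001, steps=10000)
print(min(energies), max(energies))
```

The recorded energy stays at its initial value $\tfrac{1}{2}kx_0^2$ to within the small error of the integration, so that at every point of the path the velocity is fixed, up to sign, by the coordinate alone.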

Another way of interpreting equation (7), or rather the fact implied
in it that the velocity $v = \frac{1}{m}\frac{ds}{dx}$ of the particle is determined not as
a function of the time but as a function of the coordinate x, is to
replace the single particle under consideration by an infinite number 
of copies of this particle, filling space (or the line x) in a continuous 
way, so that at any instant t a copy is to be found situated at, or rather 
passing through, any point x. This method is similar to one used in 
hydrodynamics except that, in the hydrodynamical case, the copies of 
a particle are replaced by actual particles (supposed to be identical), 
moving under the combined influence of external forces and forces of 
mutual action (represented by the hydrostatic pressure). Provided we 
are not interested in the individuality of the particles, i.e. in the question 
which particle is to be found at a given point, the motion of the particles 
can be specified by defining the velocity of the particle passing through 
each fixed point as a function of the coordinates of this point and, in 
general, of the time. If the velocity does not depend upon the time 
(it should be remembered that the velocity we are speaking of refers 
not to a definite particle but to a definite point) the motion is called 
stationary or steady. 

Thus the picture which can be associated with equation (7) is that 
of an assembly of copies of the particle under consideration, streaming 
steadily and filling space in a continuous way. If we select from this 
assembly a definite copy which at the time t was passing through the 
point x, then, knowing the dependence of the velocity v upon x, we
can follow its motion and determine both the velocity and position of 
this particular copy as functions of the time. For instance, at the 
moment t+dt the copy in question will be situated at the point x+v dt,
and will have the velocity

$$v(x+dx) = v(x+v\,dt) = v(x)+\frac{dv}{dx}v\,dt,$$

which means that its acceleration is equal to $v\,dv/dx$, as was obtained above.
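
The procedure of following one copy through the steady stream can be sketched as follows. The example assumes a constant force F, i.e. U = −Fx, for which equation (7) with the positive sign gives $v(x) = \sqrt{2(W+Fx)/m}$; the Runge-Kutta integration of dx/dt = v(x), the function names, and the numerical values are likewise assumptions. The position reached is compared with the elementary result $x = v_0t+\tfrac{1}{2}(F/m)t^2$ of Newton’s equation.

```python
import math


def velocity_field(x, m, w, force):
    """v(x) from equation (7) for U = -force * x (positive root)."""
    return math.sqrt(2.0 * (w + force * x) / m)


def follow_copy(x0, m, w, force, t_end, steps=1000):
    """Integrate dx/dt = v(x) by the classical fourth-order Runge-Kutta rule."""
    dt = t_end / steps
    x = x0
    for _ in range(steps):
        k1 = velocity_field(x, m, w, force)
        k2 = velocity_field(x + 0.5 * dt * k1, m, w, force)
        k3 = velocity_field(x + 0.5 * dt * k2, m, w, force)
        k4 = velocity_field(x + dt * k3, m, w, force)
        x += dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
    return x


# Illustrative (assumed) values: the copy starts at x = 0 with velocity v0.
m, force, v0, t_end = 1.0, 2.0, 1.0, 1.0
w = 0.5 * m * v0 ** 2                       # total energy, since U(0) = 0
x_stream = follow_copy(0.0, m, w, force, t_end)
x_newton = v0 * t_end + 0.5 * (force / m) * t_end ** 2
print(x_stream, x_newton)
```

The two positions agree to within the error of the integration, which shows in a particular case that a velocity given as a function of the coordinate contains the same information as Newton’s equation of motion.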

We have thus shown that the wave-mechanical equation of motion 
actually reduces to the classical equation in the limiting case when the 
wave-length associated with the motion of a particle tends to zero, 
either owing to increase in velocity (which is a thing that can actually 
happen) or to decrease in the constant h (which is an artifice). The 
fundamental reason for this lies in the elimination of partial reflection, 
i.e. of a reversal in the direction of the velocity or, in other words, the 
elimination of the uncertainty in its sign. 

Strictly speaking, however, this uncertainty cannot be eliminated. 
It is impossible to describe the motion of a particle in the classical way, 
i.e. as a determinate change of position and velocity with the time. 
The only way of describing it is to ascertain the probability of finding 
the particle at a given place and the probability that, being at this 
place, it is moving in the one or the other direction (the magnitude of 
the velocity being fixed). This intrusion of the probability conception 
into the description of the motion is necessary because of the ambiguity 
arising from the alternative: partial reflection or partial transmission. 
One could say that this ambiguity—wholly alien to classical mechanics 
—forms the gate through which the concept of probability penetrates 
into the realm of physics. 

The probability of position is measured, as we know, by the product
$\psi\psi^*$, so that $\psi(x)\psi^*(x)\,dx$ measures the probability that the particle is
situated in the region between x and x+dx. Using the picture of an
assembly of copies of the particle in question filling space (or the x-axis)
in a continuous way, we can interpret $\psi\psi^*\,dx$ as the relative number
of copies situated within the interval dx (this number is independent of
the time so long as $\psi = \psi_0 e^{-i2\pi\nu t}$, corresponding to a motion with a
definite total energy $W = h\nu$). If the integral $\int_{-\infty}^{+\infty}\psi\psi^*\,dx$ converges,
$\psi$ can be normalized in such a way that this integral is equal to 1, in
agreement with the usual normalization of probability. Otherwise we
need not worry about this normalization, since after all only relative
values of $\psi\psi^*$ for different points come into account.

It should be noticed that in the classical description of the motion 
we can also use a continuous assembly of copies instead of an individual 
particle, as is actually done when the equation of motion is written in 
the form (7) corresponding to the determination of the velocity as a
function of the coordinate and not of the time. From the point of 
view of this description the difference between the old and the new 
theory can be summed up as follows. In the old theory it is always 
possible to ‘individualize’ a certain copy by following its motion, i.e. by 
determining its coordinate and velocity as definite functions of time, 
whereas in the new theory such ‘individualization’ is impossible, the 
direction of motion being uncertain. It thus becomes necessary to consider
the assembly as a whole without attempting to disentangle it,
i.e. to trace the motion of a particular copy in time. This being so, the
density of the assembly, i.e. the relative number of copies per unit
range, or, in other words, the probability of finding the particle represented
by these copies in a given range, becomes the primary thing
that can and must be determined, whereas in classical mechanics it
remains irrelevant and therefore arbitrary. Of course the determination
of $\psi\psi^*$ in wave mechanics is also connected with some arbitrariness,
which can only be removed by specifying the boundary conditions or
the conditions at infinity for the function $\psi$.

Knowing the function $\psi$, one can determine many other things besides
the probability of position. Thus by means of it we can determine the
probability of the two opposite directions of motion, that is, of the two
opposite signs of the velocity, if the magnitude of the velocity is
assumed to be fixed for a given position by the classical relation
$v = \sqrt{2(W-U)/m}$ or by de Broglie’s relation $v = h/(m\lambda)$. If p' is
the probability of the positive direction and p'' that of the negative
direction, then the average or probable value of the velocity at a given
point is given by the formula
$$\bar v = (p'-p'')\,|v| \tag{10}$$
with the condition p'+p'' = 1.

This probable velocity, or the probabilities p, can be determined
quite generally with the help of the relation (4), as soon as the physical
meaning of this relation is recognized. We shall first see what the
expression $A^2\,d\phi/dx$ means in the simple case of a wave travelling in
one direction in a force-free space, that is, a wave representing the free
motion of a particle in one direction. We have, in this case, according
to (1 a), $\phi = \alpha x$ and consequently
$$A^2\frac{d\phi}{dx} = A^2\alpha = |\psi|^2\,\frac{2\pi}{\lambda} = \frac{2\pi m}{h}\,|\psi|^2 v.$$

If $|\psi|^2$ is interpreted as the (relative) density of the copies of the
particle, then the product $|\psi|^2 v = j$ must obviously be defined as the
corresponding current density, i.e. the (relative) number of copies passing
through the given point or plane x = const. in the direction of v in
unit time. If $|\psi|^2$ is interpreted as the probability density, then j can
be defined as the probability current density, i.e. the probability that
the particle will cross the plane x = const. in unit time. The ratio
$j/|\psi|^2$ is nothing else than the actual velocity of motion, which, in view
of the fact that the direction of the motion is perfectly definite, coincides
with the probable velocity $\bar v$ (p' or p'' = 1).

It is natural to extend the above interpretation of the expression
$A^2\,d\phi/dx$ as a measure of the current density to any type of wave
function $\psi$, for from this point of view the fact that $A^2\,d\phi/dx$ is constant
(i.e. independent of x) simply means that the number of copies passing
through different planes $x = x_1$ and $x = x_2$, say, is the same, just as
if they were actual indestructible particles. The law expressed by the
relation (4) would thus be the law of conservation of the number of
copies or of the conservation of probability (see below). If this interpretation
is correct, then it must obviously be possible to write j in
the form
$$j = |\psi|^2\,\bar v, \tag{10 a}$$
where $\bar v$ denotes the probable velocity of the copies at the point in
question. Now this is actually the case if j is defined as
$$j = \frac{h}{2\pi m}\,A^2\frac{d\phi}{dx}$$
(the coefficient $h/2\pi m$ is the same as in the special case considered
above), which gives the following expression for the probable velocity:
$$\bar v = \frac{h}{2\pi m}\,\frac{d\phi}{dx}. \tag{10 b}$$
The ‘phase’ $\phi$ can be expressed in terms of the function $\psi = Ae^{i\phi}$ and
its conjugate complex $\psi^* = Ae^{-i\phi}$ by means of the formula
$$\phi = \frac{1}{2i}\log\frac{\psi}{\psi^*},$$
whence it follows that
$$\bar v = \frac{h}{4\pi i m}\left(\frac{1}{\psi}\frac{d\psi}{dx}-\frac{1}{\psi^*}\frac{d\psi^*}{dx}\right),$$
or, according to (8 a),
$$\bar v = \frac{1}{m}\,R\!\left(\frac{dS}{dx}\right), \tag{10 c}$$


R(f) denoting the real part of f. In the classical theory this equation
reduces to $\bar v = v$, in accordance with the fact that the motion proceeds
in a perfectly definite direction, the probabilities p' and p'' being equal
respectively to 1 and 0. In the wave-mechanical theory $|\bar v|$ is, in general,
different from |v|, the values of the probabilities p' and p'' being different
from both 1 and 0. They can be determined from $\bar v$ and v by
means of the formula
$$p' = \frac12\left(1+\frac{\bar v}{|v|}\right),\qquad p'' = \frac12\left(1-\frac{\bar v}{|v|}\right),$$
which follows from (10) together with the condition p'+p'' = 1.

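Substituting (10 c) in (10 a), we get the following expression for the
current density:
$$j = \frac{h}{4\pi i m}\left(\psi^*\frac{d\psi}{dx}-\psi\frac{d\psi^*}{dx}\right) = \frac{h}{2\pi m}\,R\!\left(\frac{\psi^*}{i}\frac{d\psi}{dx}\right). \tag{11}$$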


We shall now check these results by applying them to two simple cases.
We shall put first
$$\psi = A'e^{i\alpha x}+A''e^{-i\alpha x},$$
which corresponds to the free motion of a particle along the x-axis in
an unspecified direction.

Assuming the coefficients A for the sake of simplicity to be real (this
condition does not involve any loss of generality, for it can always be
satisfied by a suitable choice of the origin x = 0), we have
$$\psi^* = A'e^{-i\alpha x}+A''e^{i\alpha x},$$
whence
$$\psi^*\,\frac{1}{i}\frac{d\psi}{dx} = \alpha(A'^2-A''^2)+i\,2\alpha A'A''\sin 2\alpha x,$$
so that j reduces to the constant value
$$j = \frac{h\alpha}{2\pi m}(A'^2-A''^2)\quad\text{or}\quad j = |v|(A'^2-A''^2). \tag{11 a}$$
Unlike j, the probable velocity
$$\bar v = \frac{j}{\psi\psi^*} = |v|\,\frac{A'^2-A''^2}{A'^2+A''^2+2A'A''\cos 2\alpha x}$$
is a function of x, varying periodically between the values
$$\bar v_{\max} = |v|\,\frac{A'+A''}{A'-A''}\quad\text{and}\quad \bar v_{\min} = |v|\,\frac{A'-A''}{A'+A''}.$$

The fact that the maximum value of the probable velocity $\bar v$ turns
out to be larger than the magnitude of the classical velocity |v| invalidates
the idea considered above of taking the latter over into the
wave-mechanical theory as the magnitude of the ‘actual’ velocity. With
$|\bar v/v| > 1$ formula (10) leads to values of the probabilities p which are
devoid of physical meaning, one of them being larger than 1 and the
other smaller than 0. Although the classical velocity can be determined
wave-mechanically from the wave-length $\lambda$ (by means of the formula
$|v| = h/(m\lambda)$), yet it is the probable velocity $\bar v$ only which has a direct
physical significance.

This is also clearly seen if we take as a second example the case
$$\psi = A'e^{+\beta x}+A''e^{-\beta x}$$
corresponding to a region of total reflection where the kinetic energy is
negative and the velocity v is imaginary. We have in this case $\psi^* = \psi$,
j = 0, and $\bar v = 0$, as might be expected.
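The two examples above can be checked numerically. The following Python sketch is an illustration only: the values chosen for h, m, α, β and the amplitudes are arbitrary assumptions, and the function names are hypothetical. It evaluates formula (11) for j on a grid of points and forms the ratio of j to ψψ*.

```python
import cmath
import math

# Illustrative values only (assumptions, not taken from the text).
H = 1.0             # Planck's constant h in arbitrary units
M = 1.0             # mass m of the particle
ALPHA = 2.0         # alpha = 2*pi/lambda
BETA = 1.5          # decay constant for the forbidden region
A1, A2 = 1.0, 0.4   # real amplitudes A' and A'' with A' > A'' > 0

V_CLASSICAL = H * ALPHA / (2 * math.pi * M)   # |v| = h/(m*lambda)


def current(psi, dpsi_dx):
    """Formula (11): j = (h / 2 pi m) * R(psi* (1/i) dpsi/dx)."""
    return H / (2 * math.pi * M) * (psi.conjugate() * dpsi_dx / 1j).real


def travelling(x):
    """psi and dpsi/dx for psi = A' e^{i alpha x} + A'' e^{-i alpha x}."""
    fwd = A1 * cmath.exp(1j * ALPHA * x)
    bwd = A2 * cmath.exp(-1j * ALPHA * x)
    return fwd + bwd, 1j * ALPHA * (fwd - bwd)


def evanescent(x):
    """psi and dpsi/dx for psi = A' e^{+beta x} + A'' e^{-beta x}."""
    up = A1 * math.exp(BETA * x)
    down = A2 * math.exp(-BETA * x)
    return complex(up + down), complex(BETA * (up - down))


xs = [k * 0.001 for k in range(4000)]
js = [current(*travelling(x)) for x in xs]
vbars = [j / abs(travelling(x)[0]) ** 2 for j, x in zip(js, xs)]

j_expected = V_CLASSICAL * (A1 ** 2 - A2 ** 2)   # formula (11 a)
vbar_max = V_CLASSICAL * (A1 + A2) / (A1 - A2)
vbar_min = V_CLASSICAL * (A1 - A2) / (A1 + A2)

print(max(abs(j - j_expected) for j in js))   # spread of j about (11 a)
print(max(vbars), vbar_max)                   # compare the maxima
print(min(vbars), vbar_min)                   # compare the minima
print(current(*evanescent(0.3)))              # forbidden region
```

With these assumed values the current stays at the value (11 a) for every x, while the probable velocity oscillates and its maximum exceeds the classical magnitude |v|, which is the point made in the text.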


3. Generalization for Non-stationary Motion in Three Dimensions;
The Hamilton-Jacobi Equation

We shall now generalize the results of the preceding section to the 
motion of a particle in three dimensions under the action of forces 
derived from a potential-energy function U which may depend not only 
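upon the coordinates x, y, z, but also upon the time t.

The wave-mechanical description of such a motion is given by the
generalized equation of Schrödinger
$$\nabla^2\psi-\frac{8\pi^2 m}{h^2}\left(\frac{h}{2\pi i}\frac{\partial\psi}{\partial t}+U\psi\right) = 0. \tag{12}$$

Our main object will be to trace the relation of this equation to the
corresponding classical equations of motion,
$$m\frac{d^2x}{dt^2} = -\frac{\partial U}{\partial x},\qquad m\frac{d^2y}{dt^2} = -\frac{\partial U}{\partial y},\qquad m\frac{d^2z}{dt^2} = -\frac{\partial U}{\partial z}. \tag{12 a}$$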


The general character of this relation can be described in a way similar
to that used for the one-dimensional motion discussed above. The
fundamental characteristics of the wave-mechanical theory can thus be
partially reduced, as before, to the ambiguity arising from the phenomenon
of partial reflection and partial transmission, a phenomenon
which implies a sudden change in the direction of the velocity, its
magnitude being assumed to be the same function of the coordinates
as in the classical theory.

The uncertainty in the direction of the velocity, which in the case
of one-dimensional motion was equivalent to an ambiguity of sign, is
now (in the case of motion in space) of a still more distressing
character. However, we may still expect this uncertainty, as well as
partial reflection, to vanish in the limiting case of motion corresponding
to infinitely short wave-lengths (which can be realized by an increase
of velocity or of mass, or by a fictitious decrease of the constant h).
Thus in this limiting case equation (12) must become equivalent to
equations (12 a) in the sense of admitting particular solutions corresponding
to a perfectly definite type of classical motion.

To demonstrate this equivalence we shall replace the particle under
consideration by an assembly of copies distributed and moving in space
like the particles of some continuous fluid (without interaction of
course!). The velocity vector v of each copy can then be defined,
according to the classical theory, as a function of the coordinates
x, y, z of the (fixed) point through which this copy is passing, and of
the time, the motion being not necessarily a steady one. It should be
noticed that the partial derivative $\partial\mathbf v/\partial t$ of v with regard to the time
does not define the acceleration of a given copy, for it refers to different
copies passing through the same point at different instants of time t
and t+dt. This acceleration can be defined by the total derivative
$d\mathbf v/dt$, its x-component being thus given by
$$\frac{dv_x}{dt} = \frac{\partial v_x}{\partial t}+\frac{\partial v_x}{\partial x}\frac{dx}{dt}+\frac{\partial v_x}{\partial y}\frac{dy}{dt}+\frac{\partial v_x}{\partial z}\frac{dz}{dt}$$
or
$$\frac{dv_x}{dt} = \frac{\partial v_x}{\partial t}+v_x\frac{\partial v_x}{\partial x}+v_y\frac{\partial v_x}{\partial y}+v_z\frac{\partial v_x}{\partial z}. \tag{13}$$
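The difference between the partial and the total derivative in (13) can be made concrete with a small numerical sketch. The velocity field used below, v(x, t) = x/(1+t) in one dimension, is a hypothetical example chosen for illustration (it describes copies that each move with constant velocity), and the helper names are assumptions.

```python
def velocity_field(x, t):
    """Assumed one-dimensional field v(x, t) = x / (1 + t)."""
    return x / (1.0 + t)


def partial_t(x, t, eps=1e-6):
    """Partial derivative with respect to t at a fixed point x."""
    return (velocity_field(x, t + eps) - velocity_field(x, t - eps)) / (2 * eps)


def partial_x(x, t, eps=1e-6):
    """Partial derivative with respect to x at a fixed time t."""
    return (velocity_field(x + eps, t) - velocity_field(x - eps, t)) / (2 * eps)


def total_derivative(x, t):
    """One-dimensional form of (13): dv/dt = dv/dt|_x + v dv/dx."""
    return partial_t(x, t) + velocity_field(x, t) * partial_x(x, t)


print(partial_t(2.0, 1.0))         # local rate of change at the fixed point
print(total_derivative(2.0, 1.0))  # acceleration of the copy passing there
```

For this field the local rate of change at a fixed point does not vanish, whereas the acceleration of the copy passing through that point does, since each copy moves uniformly.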


We shall now assume the motion of the fluid formed by our assembly
of copies to be irrotational, which means that the velocity vector can
be represented as the gradient of a scalar function, the so-called ‘velocity
potential’. We shall denote this function by s/m and put accordingly
$$m\mathbf v = \nabla s, \tag{13 a}$$
that is
$$v_x = \frac{1}{m}\frac{\partial s}{\partial x},\qquad v_y = \frac{1}{m}\frac{\partial s}{\partial y},\qquad v_z = \frac{1}{m}\frac{\partial s}{\partial z}.$$


We make this assumption (which is by no means necessary) not only 
because we desire to simplify the formulation of the classical theory as 
applied to the copy assembly, but also because we wish to establish the 
connexion between this theory and the wave-mechanical theory. We 
have in fact, for a wave propagated in one definite direction, a relation 
exactly similar to (13 a) between the phase $\phi$ and the vector $\boldsymbol\alpha$ whose
direction is the direction of propagation and whose length is $2\pi/\lambda$, where
$\lambda$ is the value of the wave-length at the corresponding point:
$$\boldsymbol\alpha = \nabla\phi. \tag{14}$$
If we put
$$\boldsymbol\alpha = \frac{2\pi}{h}\,m\mathbf v \tag{14 a}$$
according to de Broglie’s relation, we get
$$\phi = \frac{2\pi}{h}\,s \tag{14 b}$$


as before [cf. (0), § 1]. Thus, by assuming irrotational motion of the 
assembly of copies, it becomes possible to establish a connexion between 
the motion of a particle and the propagation of waves in the limiting 
case of infinitely short waves, i.e. when partial reflection is excluded 
and the motion of every copy of the particle proceeds along a perfectly 
definite path; this path can be considered as the ‘ray’ passing through 
the point at which the copy in question was initially situated. If partial 
reflection does take place the idea of rays loses all meaning, each ray 
branching into two at every point. Only by neglecting reflection can 
one speak of rays as lines along which the waves, i.e. the surfaces of 
constant phase, are propagated. 

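Returning to the expression (13) for the x-component of the acceleration
of the copy passing through the point x, y, z at the instant t, we
can, because of (13 a), rewrite it in the form
$$\frac{dv_x}{dt} = \frac{\partial v_x}{\partial t}+v_x\frac{\partial v_x}{\partial x}+v_y\frac{\partial v_y}{\partial x}+v_z\frac{\partial v_z}{\partial x},$$
since $\dfrac{\partial v_x}{\partial y} = \dfrac{1}{m}\dfrac{\partial^2 s}{\partial x\,\partial y} = \dfrac{\partial v_y}{\partial x}$, etc. Therefore
$$\frac{dv_x}{dt} = \frac{\partial v_x}{\partial t}+\frac12\frac{\partial}{\partial x}(v^2)$$
or
$$\frac{dv_x}{dt} = \frac{1}{m}\frac{\partial}{\partial x}\left[\frac{\partial s}{\partial t}+\frac{1}{2m}(\nabla s)^2\right].$$
The equation $m\,\dfrac{dv_x}{dt} = -\dfrac{\partial U}{\partial x}$, which is the first of the equations (12 a), is
thus equivalent to
$$\frac{\partial}{\partial x}\left[\frac{\partial s}{\partial t}+\frac{1}{2m}(\nabla s)^2+U\right] = 0.$$
Similar results are obtained for the second and the third equations, and
so all three of them can be replaced by the single equation
$$\frac{\partial s}{\partial t}+\frac{1}{2m}(\nabla s)^2+U = F(t),$$
where F(t) is an arbitrary function of the time alone. This function,
without loss of generality, can be put equal to zero, for it corresponds
to an additive term $\int F(t)\,dt$ in s which is irrelevant for the determination
of the velocity according to (13 a). The function s can thus be
defined by the equation
$$\frac{\partial s}{\partial t}+\frac{1}{2m}(\nabla s)^2+U = 0. \tag{15}$$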

This equation was established by Hamilton and Jacobi and bears
their name. In the special case when U does not depend upon the time
explicitly (constant field of force), the function s, usually called the
(mechanical) ‘action’, reduces to
$$s = s_0(x,y,z)-Wt, \tag{15 a}$$
where $s_0$ is determined by the equation
$$\frac{1}{2m}\left[\left(\frac{\partial s_0}{\partial x}\right)^2+\left(\frac{\partial s_0}{\partial y}\right)^2+\left(\frac{\partial s_0}{\partial z}\right)^2\right]+U = W. \tag{15 b}$$


Here W is a constant which can obviously be defined as the energy.
Thus, in a sense, equation (15 b), in conjunction with the relation
(13 a), expresses the law of the conservation of energy. However, as we
have just seen, it expresses much more than that,† since, in conjunction
with (13 a), it is equivalent to the three classical equations of motion 
(12 a) for the special case of an invariable field of force and of a fixed 
value of the total energy. The equations (12 a) and (15b)—or more 
generally (15)—are formally different because the former refer to an 
individual particle, while the latter refer to a continuous assembly of 
copies of this particle. If we select a definite copy and follow its motion 
we come back to equations (12 a). 
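The statement that (15 b) together with (13 a) reproduces the equations of motion can be illustrated in one dimension. The sketch below is only a check under assumed data: the potential U = Kx²/2, the numerical values of m, W and K, and the function names are all hypothetical. It builds the velocity field from (15 b) and compares the acceleration v dv/dx of a copy with the force divided by the mass.

```python
import math

M = 2.0   # mass m (assumed value)
W = 5.0   # total energy W (assumed value)
K = 3.0   # constant of the assumed potential U = K x^2 / 2


def potential(x):
    return 0.5 * K * x * x


def force(x):
    """-dU/dx for the assumed potential."""
    return -K * x


def velocity(x):
    """(13 a) with (15 b): v = (1/m) ds0/dx, ds0/dx = sqrt(2m(W - U))."""
    return math.sqrt(2.0 * M * (W - potential(x))) / M


def acceleration_of_copy(x, eps=1e-6):
    """v dv/dx, the acceleration of the copy passing through x."""
    dv_dx = (velocity(x + eps) - velocity(x - eps)) / (2.0 * eps)
    return velocity(x) * dv_dx


for x in (-1.0, -0.3, 0.5, 1.2):
    print(x, M * acceleration_of_copy(x), force(x))
```

The points are chosen inside the classically allowed region W > U, where the square root is real.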

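It can now easily be shown that in the limiting case of infinitely
small wave-length the wave equation (12) admits particular solutions
of the form $\psi = Ae^{i\phi}$, representing a one-sided propagation of waves
which can be associated, by means of the relations (14), (14 a), and
(14 b), with the motion of the particle in question according to the
classical theory, the different ‘rays’ coinciding with the paths of the
different copies of this particle.

Putting $\psi = Ae^{i\phi}$, we get in the same way as in § 1
$$\nabla\psi = e^{i\phi}(\nabla A+iA\nabla\phi),$$
whence
$$\nabla^2\psi = \frac{\partial^2\psi}{\partial x^2}+\frac{\partial^2\psi}{\partial y^2}+\frac{\partial^2\psi}{\partial z^2} = e^{i\phi}\left[\nabla^2A-A(\nabla\phi)^2+i(2\nabla A\cdot\nabla\phi+A\nabla^2\phi)\right].$$

† Except in the one-dimensional case.

We have further
$$\frac{\partial\psi}{\partial t} = e^{i\phi}\left(\frac{\partial A}{\partial t}+iA\frac{\partial\phi}{\partial t}\right).$$
Substituting these expressions in equation (12), cancelling the common
factor $e^{i\phi}$, and separating the real and imaginary parts, we obtain the
two equations:
$$\nabla^2A-\left[\frac{8\pi^2m}{h^2}\left(\frac{h}{2\pi}\frac{\partial\phi}{\partial t}+U\right)+(\nabla\phi)^2\right]A = 0,$$
$$\frac{4\pi m}{h}\frac{\partial A}{\partial t}+2\nabla A\cdot\nabla\phi+A\nabla^2\phi = 0.$$
If $\phi$ is replaced by $\dfrac{2\pi}{h}s$, these equations become
$$-\frac{h^2}{8\pi^2m}\frac{\nabla^2A}{A}+\frac{\partial s}{\partial t}+\frac{1}{2m}(\nabla s)^2+U = 0, \tag{16}$$
$$2m\frac{\partial A}{\partial t}+2\nabla A\cdot\nabla s+A\nabla^2s = 0. \tag{16 a}$$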

Putting h = 0 we see that the first of these equations reduces to the
Hamilton-Jacobi equation (15). The same result is obtained if $\nabla^2A = 0$,
which must obviously express the general condition for one-sided propagation
of waves of finite length. In both cases the wave-mechanical
theory becomes completely equivalent to the classical theory. Both
cases are, of course, fictitious, h being a constant and the equation
$\nabla^2A = 0$ being satisfied only under very special conditions, in
particular for force-free motion. The equation (16) can, however,
reduce approximately to (15) in the case of a nearly one-sided wave
propagation with a very weak partial reflection, so weak that the
reflected (or scattered) waves can be neglected. This condition is more
nearly approached the larger the mass m of the particle for a given
velocity or the larger the velocity for a given mass, i.e. the smaller the
wave-length, if we are treating motion corresponding to a constant
value of the energy W. In the latter case the wave-length becomes
a definite function of the coordinates. In the general case the idea of
wave-length has no precise meaning and can be introduced only by
representing the wave function $\psi$ as a superposition of waves with
different frequencies, corresponding to motions with different energies.

If U does not contain the time explicitly, equations (16) and (16 a)
admit particular solutions of the type $s = s_0(x,y,z)-Wt$ and
$A = A(x,y,z)$, i.e. $\partial s/\partial t = -W$ and $\partial A/\partial t = 0$. They therefore reduce to
$$-\frac{h^2}{8\pi^2m}\frac{\nabla^2A}{A}+\frac{1}{2m}(\nabla s_0)^2+U = W \tag{17}$$
and
$$2\nabla A\cdot\nabla s_0+A\nabla^2s_0 = 0. \tag{17 a}$$
In the limiting case h = 0 the first of these becomes equivalent to the
classical equation (15 b).
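In one dimension the pair (17), (17 a) can be probed numerically. The following sketch rests on assumptions that are not in the text: a potential U = Kx²/2, arbitrary values of m, W, K, C and h, and hypothetical function names. It takes ds₀/dx from the classical equation (15 b), sets A² ds₀/dx = C, and evaluates by finite differences both the left-hand side of (17 a) and the term of (17) that is dropped when h is put equal to zero.

```python
import math

M = 2.0   # mass m (assumed value)
W = 5.0   # total energy W (assumed value)
K = 3.0   # constant of the assumed potential U = K x^2 / 2
C = 1.0   # constant C in the relation A^2 ds0/dx = C


def potential(x):
    return 0.5 * K * x * x


def ds0_dx(x):
    """Classical value from (15 b): ds0/dx = sqrt(2m(W - U))."""
    return math.sqrt(2.0 * M * (W - potential(x)))


def amplitude(x):
    """A = sqrt(C / (ds0/dx))."""
    return math.sqrt(C / ds0_dx(x))


def first_derivative(f, x, eps=1e-5):
    return (f(x + eps) - f(x - eps)) / (2.0 * eps)


def second_derivative(f, x, eps=1e-4):
    return (f(x + eps) - 2.0 * f(x) + f(x - eps)) / (eps * eps)


def residual_17a(x):
    """One-dimensional left-hand side of (17 a): 2 A' s0' + A s0''."""
    return (2.0 * first_derivative(amplitude, x) * ds0_dx(x)
            + amplitude(x) * first_derivative(ds0_dx, x))


def dropped_term(x, h):
    """First term of (17): -(h^2 / 8 pi^2 m) A''/A."""
    return (-h * h / (8.0 * math.pi ** 2 * M)
            * second_derivative(amplitude, x) / amplitude(x))


for x in (-1.0, 0.0, 0.7, 1.3):
    print(x, residual_17a(x), dropped_term(x, 0.01), W - potential(x))
```

Under these assumptions the residual of (17 a) vanishes to rounding error, and for a small assumed h the dropped term is small compared with W − U at points well inside the allowed region.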

This equivalence, as well as the approximate equivalence which can
be obtained in the case of large values of W or m, must not be misunderstood.
It refers to particular solutions of equations (17) and (17 a), or of
the corresponding Schrödinger equation
$$\nabla^2\psi+\frac{8\pi^2m}{h^2}(W-U)\psi = 0 \tag{17 b}$$
with
$$\psi = Ae^{i2\pi(s_0-Wt)/h} = \psi^0(x,y,z)\,e^{-i2\pi Wt/h}, \tag{17 c}$$
that is, to solutions which represent (approximately) waves travelling in
a definite direction (the direction may, of course, vary from point to
point, being defined by the direction of the ‘rays’ passing through these
points). Now the general solution of (17 b) in the case of short waves
can be represented as a superposition of a number of such particular
solutions corresponding to waves travelling in different directions, under
the limitations imposed by boundary conditions (in the case of long
waves this is possible for force-free motion only). The classical equation
(15 b), on the other hand, does not admit of such superposition for the
function $\psi$ defined as $Ae^{i2\pi s/h}$. This can clearly be seen in the simple
case of one-dimensional motion where A is connected with s by the
relation $A^2\,\dfrac{ds}{dx} = C$ [cf. (4), § 1], so that $\psi = \sqrt{\dfrac{C}{ds/dx}}\;e^{i2\pi s/h}$. The physical
reason for this is that ‘superposition’ of two different types of motion
would mean, according to classical mechanics, their ‘simultaneous
realization’, an obviously impossible thing if they are alternative. In
wave mechanics, on the contrary, it is just this alternative character
which is expressed by superposition, the latter corresponding to the
addition law of the classical probability theory. Similar results apply
to the general equations (12) and (15), the former allowing the superposition
of processes with different energies if U does not depend upon
the time, while the latter reduces in this case to equation (15 b) corresponding
to one definite value of the energy W.

The non-validity of the superposition principle in classical mechanics
can easily be demonstrated with the help of the function $S = \dfrac{h}{2\pi i}\log\psi$
introduced in § 2 [eq. (8)]. This function satisfies the differential
equation
$$\frac{h}{4\pi i m}\nabla^2S+\frac{\partial S}{\partial t}+\frac{1}{2m}(\nabla S)^2+U = 0, \tag{18}$$
which is obtained from Schrödinger’s equation (12) by the substitution
$\psi = e^{i2\pi S/h}$ and which reduces to the Hamilton-Jacobi equation (15) if
h is put equal to zero. The function S thus coincides in this case
with the function s, which means that the amplitude A can be considered
as practically constant.

Now if in the Hamilton-Jacobi equation (15) we put $s = S = \dfrac{h}{2\pi i}\log\psi$,
we get the following ‘approximate’ equation for $\psi$:
$$\frac{h}{2\pi i}\,\psi\frac{\partial\psi}{\partial t}-\frac{h^2}{8\pi^2m}(\nabla\psi)^2+U\psi^2 = 0$$
or
$$(\nabla\psi)^2-\frac{8\pi^2m}{h^2}\left(\frac{h}{2\pi i\,\psi}\frac{\partial\psi}{\partial t}+U\right)\psi^2 = 0, \tag{18 a}$$
which is quadratic and of the first order (like the equation for S) instead
of being linear and of the second order like the exact equation of
Schrödinger. If $\psi_1$ and $\psi_2$ are two particular solutions of (18 a), the
function $\psi = \psi_1+\psi_2$ will not in general represent a solution of this
equation.
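The failure of superposition for the quadratic equation can be seen in the simplest case. The sketch below is an illustration under assumptions: force-free motion in one dimension (U = 0), arbitrary units with h = m = 1, and hypothetical function names. Two plane waves with momenta +p and −p each satisfy both the linear equation (12) and the quadratic equation (18 a); their sum satisfies only the linear one.

```python
import cmath
import math

H = 1.0   # Planck's constant h (assumed units)
M = 1.0   # mass m (assumed units)


def plane_wave(p):
    """Return a function giving (psi, psi_t, psi_x, psi_xx) at (x, t)
    for psi = exp(i 2 pi (p x - W t) / h) with W = p^2 / 2m and U = 0."""
    energy = p * p / (2.0 * M)

    def fields(x, t):
        psi = cmath.exp(2j * math.pi * (p * x - energy * t) / H)
        psi_t = -2j * math.pi * energy / H * psi
        psi_x = 2j * math.pi * p / H * psi
        psi_xx = -((2.0 * math.pi * p / H) ** 2) * psi
        return psi, psi_t, psi_x, psi_xx

    return fields


def superpose(f, g):
    """Pointwise sum of two solutions and of their derivatives."""
    return lambda x, t: tuple(a + b for a, b in zip(f(x, t), g(x, t)))


def linear_residual(fields, x, t):
    """Left-hand side of (12) with U = 0."""
    _, psi_t, _, psi_xx = fields(x, t)
    return psi_xx - 8.0 * math.pi ** 2 * M / H ** 2 * (H / (2j * math.pi) * psi_t)


def quadratic_residual(fields, x, t):
    """Left-hand side of the first form of (18 a) with U = 0."""
    psi, psi_t, psi_x, _ = fields(x, t)
    return (H / (2j * math.pi) * psi * psi_t
            - H ** 2 / (8.0 * math.pi ** 2 * M) * psi_x ** 2)


forward = plane_wave(1.5)
backward = plane_wave(-1.5)
both = superpose(forward, backward)

print(abs(linear_residual(forward, 0.3, 0.2)), abs(quadratic_residual(forward, 0.3, 0.2)))
print(abs(linear_residual(both, 0.3, 0.2)), abs(quadratic_residual(both, 0.3, 0.2)))
```

For the sum the quadratic residual works out analytically to −4Wψ₁ψ₂, which cannot vanish, whereas the linear residual is zero because equation (12) is linear.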

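Returning to the representation of the exact wave function in the
form $Ae^{i\phi} = Ae^{i2\pi s/h}$, and considering equation (16 a) connecting A and
s, which has been disregarded hitherto, we see that this equation can
be simplified if multiplied by A. We have in fact $2A\,\dfrac{\partial A}{\partial t} = \dfrac{\partial A^2}{\partial t}$ and
$$2A\nabla A\cdot\nabla s+A^2\nabla^2s = \nabla A^2\cdot\nabla s+A^2\nabla^2s = \operatorname{div}(A^2\nabla s),$$
so that
$$\frac{\partial A^2}{\partial t}+\operatorname{div}\left(\frac{1}{m}A^2\nabla s\right) = 0. \tag{19}$$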


This equation is of the same form as the equation of continuity, i.e. the
equation of the conservation of mass in hydrodynamics or of the conservation
of electricity in electrodynamics,
$$\frac{\partial\rho}{\partial t}+\operatorname{div}\mathbf j = 0,$$
where $\rho$ is the density of mass or electrical charge and j the corresponding
current density. In the present case we can interpret the
quantity
$$A^2 = \psi\psi^* = \rho$$
as the density of the copy assembly (i.e. the relative number of copies
of the given particle in unit volume) or the density of probability. If,
further, we define the corresponding current density by the formula
$$\mathbf j = \frac{1}{m}A^2\nabla s, \tag{19 a}$$
then equation (19) will express the law of the conservation of the copies
or of the probability. In the classical theory the vector $\nabla s/m$ reduces
to the actual velocity v of the particle (or more exactly of its copies
at the given point), so that j assumes the usual form of the product
of $\rho$ with v. In the exact wave-mechanical theory it can also be written
in the form
$$\mathbf j = \rho\,\bar{\mathbf v},$$
where the vector
$$\bar{\mathbf v} = \frac{1}{m}\nabla s \tag{19 b}$$
must obviously be interpreted as the probable velocity. The classical
velocity can be computed as usual by means of the formula
$$|v| = \sqrt{\frac{2}{m}(W-U)},$$
its direction being, however, uncertain. According to the definition of
A and s, we have $\psi = Ae^{i2\pi s/h}$, $\psi^* = Ae^{-i2\pi s/h}$, whence
$$s = \frac{h}{4\pi i}\log\frac{\psi}{\psi^*},$$
and consequently
$$\mathbf j = \frac{h}{4\pi i m}(\psi^*\nabla\psi-\psi\nabla\psi^*). \tag{20}$$
Introducing the function $S = \dfrac{h}{2\pi i}\log\psi$, we get $\nabla S = \dfrac{h}{2\pi i}\dfrac{\nabla\psi}{\psi}$ and
$\nabla S^* = -\dfrac{h}{2\pi i}\dfrac{\nabla\psi^*}{\psi^*}$, so that
$$\mathbf j = \frac{1}{m}\psi\psi^*\,R(\nabla S) \tag{20 a}$$
and
$$\bar{\mathbf v} = \frac{1}{m}R(\nabla S). \tag{20 b}$$

Comparing this with (19 b), we see that the function s is equal to the
real part of S, in accordance with the relation $S = s+\dfrac{h}{2\pi i}\log A$ which
results from comparing the two expressions $e^{i2\pi S/h}$ and $Ae^{i2\pi s/h}$ for $\psi$.
The probable velocity (20 b) could be represented in the form
$$\bar{\mathbf v} = |v|\int \mathbf n\,p(\mathbf n)\,d\omega,$$
where n is the unit vector which defines the direction of the classical
velocity and $p(\mathbf n)\,d\omega$ is the probability that this unit vector lies in the
infinitely small solid angle $d\omega$. An unambiguous determination of this
probability appears, however, to be impossible, except for one-dimensional
motion considered in the preceding section. This is quite natural
if we remember that the notion of classical velocity, as measured by
the time derivative of the coordinates, cannot be taken over into wave
mechanics.

It should be mentioned in conclusion that the relation between wave
mechanics and classical mechanics is usually compared with the relation
between wave optics and the so-called geometrical optics, the latter
being defined as the limiting case of wave optics for very small wave-lengths.
This statement would, however, be misleading unless we add
to it that in geometrical optics partial reflection of light (which actually
decreases with decrease of wave-length) should be wholly left out of
account, even in its simplest form on the boundary surface between
two homogeneous media. In this case, and only in this case, is it possible
to introduce the idea of rays as lines along which the propagation
of light takes place (this is why geometrical optics is often called ‘ray
optics’ in contradistinction to wave optics, where the idea of ‘rays’ has
in general no meaning). It was the merit of Hamilton to show, one
hundred years ago, that in this limiting case the wave conception of
light can be replaced by the corpuscular conception, and that the rays
can be described as the paths of light particles moving, according to
Newton’s classical law, in a certain field of force. The potential energy
of this field of force U is determined by the refractive index $\mu$ according
to the relation $\mu^2 = 1-\gamma U$,

where $\gamma$ is a constant depending upon the definition of the mass of
a light particle.† But perhaps the main merit of Hamilton’s work was
that he applied the same considerations to the motion of particles of
ordinary matter, thus for the first time associating such motion with the
propagation of (infinitely short) waves and describing it by equation (15).
This association of particles with waves, which in Hamilton’s theory
was achieved by interpreting the ‘mechanical action’ s as a measure
of the phase function $\phi$, was, however, completely forgotten for
a hundred years, until de Broglie rediscovered it in the way described
in Part I, and Schrödinger introduced his wave equation, whose relation
to the Hamilton-Jacobi equation has been discussed above.

† This relation is obtained in the simplest way by comparing de Broglie’s formula
for the wave-length $1/\lambda = \sqrt{2m(W-U)}/h$ with the formula $\mu = \lambda_0/\lambda$, which can be
considered as the definition of the refractive index, $\lambda_0$ being the value of $\lambda$ in vacuo,
i.e. for a place where $\mu = 1$.

This mutual reaction of optics and mechanics must not be misinterpreted
as an indication of a true analogy between them, in the sense
of a wave-corpuscular duality of light. We must not be led by it to 
infer the real existence of photons, moving in material bodies according 
to the laws of wave mechanics. For we could replace optics by acoustics, 
i.e. light vibrations by mechanical vibrations propagated in the form 
of waves in elastic media according to an equation of exactly the same 
kind as the differential equation for the light waves. In the limiting 
case of infinitely short acoustical waves we could therefore obtain 
exactly the same results as in optics, i.e. a kind of ‘ray acoustics’ 
instead of a ‘wave acoustics’. This would enable one to formulate a 
corpuscular theory of sound and describe the propagation of sound as 
the motion, according to wave mechanics, of certain particles—e.g. 
‘phonons’. I do not think, however, that anybody would believe in the 
reality of such ‘phonons’. This does not mean, of course, that the 
photons are equally unreal, for the analogy between acoustics and optics 
is just as superficial as that between optics and mechanics (or acoustics 
and the mechanics of single particles).—I am inclined, however, to 
think that photons have no more reality than ‘phonons’, and that they 
are created by a ‘reflection’, as it were, of the wave-corpuscular duality 
of matter in the phenomena of light (cf. Part I). 

4. Comparison of the Approximate Solutions of Schrödinger’s Equation;
Comparison of Classical and Wave-mechanical Average Values

Although in the case h = 0 the functions s and S satisfy the same
equation (namely, that of Hamilton and Jacobi), yet the approximate
expressions for $\psi$ obtained therefrom, according to the formulae
$\psi = Ae^{i2\pi s/h}$ and $\psi = e^{i2\pi S/h}$, turn out to be somewhat different, for the
‘amplitude’ A obtained by means of equation (16 a) is in general a
certain function of the coordinates (and the time), varying very slowly
compared with the ‘phase factor’ $2\pi s/h$.

The discrepancy between the two approximate solutions is due to
the fact that the error introduced by putting h = 0 is larger in the
case of equation (18), which contains h in the first power, than in
the case of equations (16) and (16 a), where h appears in the second
power. In the latter case we thus drop a small term of the second order,
while in the former case we drop a much larger term of the first order.
In order to remove this discrepancy we must put
$$S = S^0+\frac{h}{2\pi i}S', \tag{21}$$
and after substituting this expression in equation (18) drop terms which
are quadratic in h but keep those which are linear in h ($S^0$ and S' being
independent of h and therefore of the same order of magnitude). We
thus get the approximate equation
$$\frac{h}{4\pi i m}\nabla^2S^0+\frac{\partial S^0}{\partial t}+\frac{h}{2\pi i}\frac{\partial S'}{\partial t}+\frac{1}{2m}(\nabla S^0)^2+\frac{h}{2\pi m i}\nabla S^0\cdot\nabla S'+U = 0. \tag{21 a}$$

Here $S^0$ must be regarded as the zero approximation, corresponding
to h = 0, i.e. as the solution of the Hamilton-Jacobi equation
$$\frac{\partial S^0}{\partial t}+\frac{1}{2m}(\nabla S^0)^2+U = 0.$$
It can obviously be identified with the (approximate) function s.

The function S' must therefore satisfy the equation
$$\frac{1}{2m}\nabla^2S^0+\frac{\partial S'}{\partial t}+\frac{1}{m}\nabla S^0\cdot\nabla S' = 0, \tag{21 b}$$
whence it follows that S' is a real quantity. Now according to (21) we
have $\psi = e^{i2\pi S/h} = e^{S'}e^{i2\pi S^0/h}$, so that, since $S^0 = s$, $e^{S'}$ must be equal
to A. Substituting in (21 b)
$$S' = \log A, \tag{21 c}$$
we do indeed get equation (16 a). It may seem that by developing the
function S in a series of powers of the parameter $h/(2\pi i)$
$$S = S^0+\frac{h}{2\pi i}S'+\left(\frac{h}{2\pi i}\right)^2S''+\dots$$
and solving the equation (18) by successive approximations, one can
obtain as good an approximation for S as may be desired. This assumption
is, however, incorrect, for it can be shown that the preceding series
is divergent or rather semi-convergent, which explains why one gets
a closer approximation by keeping the first-order term, as has been
done above. In fact the general solution of a differential equation
of the second order cannot be approximated to by starting
with the solution of the equation of the first order obtained by
dropping the second-order terms, however small the parameter by
which they are multiplied may be, just as a quadratic equation cannot
be approximated to by the linear one obtained by dropping the quadratic
term. If, however, the latter is multiplied by a small parameter, then
one of the two solutions of the quadratic equation can be approximated
to by the solution of the linear one. A similar relationship exists
between the function $\psi = e^{i2\pi S^0/h+S'}$ and one of the particular solutions
of Schrödinger’s equation, representing approximately waves
travelling in one direction. It should be mentioned that this direction
need not remain constant; it can be changed by total reflection, which,
in contradistinction to partial reflection, is a phenomenon perfectly
compatible with classical mechanics since it does not involve any
ambiguity and therefore does not challenge a deterministic description
of the motion. The difference between classical mechanics and wave
mechanics in the approximate form given above, in so far as total
reflection is concerned, consists only in the fact that, according to the
latter, the particle can penetrate into those regions of the field of force
where its ‘classical’ velocity becomes imaginary.
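The remark about a quadratic equation whose quadratic term carries a small parameter can be illustrated by a short computation. The equation εx² + x − 1 = 0 used below is a hypothetical example chosen for this sketch, as are the function names; the linear equation obtained by dropping the quadratic term has the single solution x = 1.

```python
import math


def quadratic_roots(eps):
    """Both roots of eps*x^2 + x - 1 = 0 for eps > 0.

    The regular root is written in a form that avoids cancellation;
    the singular root grows like -1/eps as eps tends to zero.
    """
    disc = math.sqrt(1.0 + 4.0 * eps)
    regular = 2.0 / (1.0 + disc)
    singular = -(1.0 + disc) / (2.0 * eps)
    return regular, singular


LINEAR_SOLUTION = 1.0   # solution of x - 1 = 0

for eps in (1e-1, 1e-2, 1e-3):
    regular, singular = quadratic_roots(eps)
    print(eps, regular - LINEAR_SOLUTION, singular)
```

As ε decreases, one root approaches the solution of the linear equation while the other moves off without limit, so that only one of the two solutions is approximated by the truncated equation.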

According to the relation v = Vs/m = V$°/m, it should follow that 
the functions s and S° must also become imaginary. So far as S° is 
concerned this is perfectly true. The function s, however, according to 
its definition, must remain real. It will therefore be different from S° 
for those regions where v is imaginary and will satisfy an equation 
different from that of Hamilton and Jacobi. We must remember that 
equations (16) and (16 a) were obtained on the assumption that both 
s and A were real. The assumption that s satisfies approximately the
Hamilton-Jacobi equation, even when the latter gives imaginary values 
for it, would thus imply a contradiction. 

This means that, in the case under consideration, $\nabla^2 A$ must be very
large and of the order of magnitude of $1/h^2$, so that the first term in
equation (16) or (17), which when omitted reduces (16) or (17) to the
Hamilton-Jacobi equation, cannot be dropped. We shall not consider
the approximate solution of equations (16) and (16 a) [or (17) and (17 a)]
for this case. It is simpler to use instead the alternative representation
of $\psi$ by means of the function $S = S^0 + S'h/(2\pi i)$, since we do not have
to worry about the reality of $S^0$. An imaginary value of $S^0$ leads,
according to (21 b), to an imaginary value of $S'$. The role of the functions
$S^0$ and $S'$ as determining the phase and the amplitude respectively
will thus be reversed for classically forbidden regions, so that, using
the expression $Ae^{i2\pi s/h}$ for $\psi$, we can put

$$ A = e^{i2\pi S^0/h} = e^{\pm 2\pi |S^0|/h}, \tag{22} $$

$$ e^{i2\pi s/h} = e^{S'} = e^{\pm i|S'|}. \tag{22 a} $$



The sign (+ or −) is determined by the condition that A (i.e. $|\psi|$) must
decrease with increased penetration into the forbidden region. It can
easily be proved directly that the expressions (22) and (22 a) constitute
an approximate solution of the equations (16) and (16 a) for the case
in question if the functions $S^0$ and $S'$ are determined respectively by
the Hamilton-Jacobi equation and by equation (21 b).

Returning to the case when $S^0$ is real (and equal to s), corresponding
to the motion in the classically allowed region of the field of force, let
us examine the approximate values which are obtained for the amplitude
$A = e^{S'}$.

We shall first consider the simplest case of a one-dimensional motion
with constant energy. We have in this case, according to (4),

$$ A^2 \frac{ds}{dx} = \text{const.}, $$

that is, since $\frac{1}{m}\frac{ds}{dx} = v$,

$$ A^2 = \frac{C^2}{v}, \tag{23} $$

where $C^2$ denotes a positive number. We thus get approximately

$$ \psi = \frac{C}{\sqrt{v}}\, e^{i2\pi s_0/h}, \tag{23 a} $$

$s_0(x)$ being a solution of the equation

$$ \frac{1}{2m}\left(\frac{ds_0}{dx}\right)^2 + U = W. $$

Formula (23) has a very simple physical meaning. It shows that the
probability of finding the particle within a certain region between x and
x+dx is inversely proportional to its velocity in this region. This is
just what we should expect if this probability were defined as proportional
to the time dt = dx/v which the particle spends in the region
in question. We thus see that the interpretation of the quantity
$\psi\psi^*\,dx = A^2\,dx$ as the relative probability of finding the particle in
the region dx is in agreement, so far as the approximate expression for
$\psi$ is used, with the classical definition of probability in terms of
duration.

If f(x) is some quantity depending upon the position of the particle,
and if the motion of the latter is confined to a limited region of the
x-axis, e.g. between $x_1$ and $x_2$, then the average value of this quantity
in the sense of classical mechanics, i.e. with respect to the time, can
be defined by the expression

$$ \bar f = \frac{1}{T}\int_0^T f\,dt \tag{24} $$

taken for a ‘round trip’ of the particle, T representing the duration of
this round trip. The round trip can obviously be replaced by a one-way
trip, since the motion must proceed in the same manner on the two
halves of a round trip, with the sign of the velocity reversed. We can
thus put

$$ \bar f = \frac{1}{t_2 - t_1}\int_{t_1}^{t_2} f\,dt, $$

where $t_1$ and $t_2$ denote the times of starting from the point $x_1$ and
arriving at the point $x_2$ respectively. Replacing dt by dx/v, where v is
a function of x determined by the equation $v^2 = \frac{2}{m}\{W - U(x)\}$, we
get

$$ \bar f = \frac{1}{t_2 - t_1}\int_{x_1}^{x_2} \frac{f(x)}{v}\,dx, \tag{24 a} $$

or, if a ‘round trip’ is taken instead of a ‘one-way’ trip,

$$ \bar f = \frac{1}{T}\oint \frac{f(x)}{v}\,dx, $$

the velocity v being taken with the same sign as dx (i.e. + when x is
increasing from $x_1$ to $x_2$, and − when it is decreasing from $x_2$ to $x_1$).

Now the expression (24 a) for $\bar f$ is identical with that obtained by
means of the wave-mechanical definition of the average value of f(x)
according to the formula

$$ \bar f = \int f(x)\,\psi\psi^*\,dx \tag{24 b} $$

if the function $\psi$ is assumed to vanish outside the region $(x_1, x_2)$
and is replaced by its approximate expression (23 a) for this region.
The normalization constant C must be determined by the condition
$\int \psi\psi^*\,dx = 1$, that is,

$$ C^2\int_{x_1}^{x_2}\frac{dx}{v} = C^2\int_{t_1}^{t_2} dt = C^2(t_2 - t_1) = 1. $$
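The equality of the time average (24) and the position average weighted by 1/v can be checked numerically. The sketch below assumes a harmonic motion x(t) = a·cos(ωt), for which v = ω√(a² − x²); the test function, the amplitude and the frequency are illustrative choices, not taken from the text. Gauss-Chebyshev nodes are used for the position integral because they absorb the integrable singularity of 1/v at the turning points.

```python
import math

def time_average(f, a, omega, n=20000):
    """Average of f(x(t)) over one period of x(t) = a*cos(omega*t) (midpoint rule)."""
    period = 2.0 * math.pi / omega
    dt = period / n
    total = sum(f(a * math.cos(omega * (k + 0.5) * dt)) for k in range(n))
    return total * dt / period

def position_average(f, a, n=2000):
    """Average of f(x) with weight 1/v, where v = omega*sqrt(a**2 - x**2).

    Gauss-Chebyshev quadrature: the integral of g(x)/sqrt(a**2 - x**2) over
    (-a, a) is approximated by (pi/n) * sum of g at the Chebyshev nodes.
    The factors omega and pi cancel between numerator and normalization.
    """
    nodes = [a * math.cos((2 * k - 1) * math.pi / (2 * n)) for k in range(1, n + 1)]
    return sum(f(x) for x in nodes) / n

# Illustrative values (assumed): f(x) = x**2, amplitude 2, angular frequency 3.
f = lambda x: x * x
print(time_average(f, 2.0, 3.0), position_average(f, 2.0))
```

Both functions return the same number to within rounding, which is the content of the identity between (24) and (24 a) for this particular motion.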

This agreement of the classical theory with the wave-mechanical theory
must not be overestimated. As a matter of fact the function $\psi$ does
not in general vanish outside the classically allowed region, but, as we
have just seen, decreases there approximately as $e^{-2\pi|S^0|/h}$. According
to the relation $v = \frac{1}{m}\frac{dS^0}{dx}$, we can put (dropping the term
containing the time)

$$ S^0 = m\int v\,dx = \int \sqrt{2m(W-U)}\,dx. \tag{25} $$

This formula applies just as well, i.e. with the same degree of approximation,
to the points inside and outside the region $(x_1, x_2)$. In the latter
case, for a point $x > x_2$, we can put

$$ |S^0(x)| = \int_{x_2}^{x}\sqrt{2m(U-W)}\,dx, \tag{25 a} $$

and consequently

$$ |\psi| = C\,e^{-\frac{2\pi}{h}\int_{x_2}^{x}\sqrt{2m(U-W)}\,dx}. \tag{25 b} $$

Thus, to the degree of approximation used, we should define the
wave-mechanical average of f(x) by the equation

$$ \bar f = \int_{-\infty}^{+\infty} f(x)\,|\psi|^2\,dx $$

with $|\psi|^2 = C^2/v$ for $W \ge U$, i.e. for $x_1 \le x \le x_2$, and

$$ |\psi|^2 = C^2\,e^{-\frac{4\pi}{h}\int_{x_2}^{x}\sqrt{2m(U-W)}\,dx} $$

for $x > x_2$ and a similar expression for $x < x_1$. The constant C must
be determined from the equation $\int_{-\infty}^{+\infty}|\psi|^2\,dx = 1$.
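The decay of the wave function in the forbidden region, formula (25 b), can be evaluated numerically for any potential. The following is a minimal sketch: the function name `decay_factor`, the midpoint rule and the sample potential are assumptions made for illustration, and the units are arbitrary.

```python
import math

def decay_factor(U, W, m, h, x2, x, n=10000):
    """Ratio |psi(x)|/C from (25 b) for a point x beyond the turning point x2.

    Evaluates exp(-(2*pi/h) * integral from x2 to x of sqrt(2*m*(U - W)))
    with the midpoint rule; U is a callable potential energy.
    """
    dx = (x - x2) / n
    integral = 0.0
    for k in range(n):
        xm = x2 + (k + 0.5) * dx
        # max(..., 0) guards against rounding just at the turning point.
        integral += math.sqrt(max(2.0 * m * (U(xm) - W), 0.0)) * dx
    return math.exp(-2.0 * math.pi * integral / h)

# Illustrative case (assumed): constant barrier U = 2 above W = 1, with
# m = 0.5 and h = 2*pi, so that the exponent is simply the penetration depth.
print(decay_factor(lambda x: 2.0, 1.0, 0.5, 2.0 * math.pi, 0.0, 3.0))
```

For a constant barrier the integral is elementary, which gives a convenient check on the quadrature before it is applied to a varying potential.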


The difference between the classical and the wave-mechanical aver¬ 
ages becomes particularly important when there are two or more 
classically allowed regions separated from one another by regions for 
which W < U. The latter, being permeable to the particle from the 
wave-mechanical point of view, do not actually separate but, on the 
contrary, connect the former regions. 

The comparison of the classical ‘time-average’ with the wave-mechanical
‘probable value’ for the case of a three-dimensional motion
is much more complicated than in the one-dimensional case and will be
considered in the next section in connexion with the wave-mechanical
interpretation of the quantum conditions. It must be remarked here
that such averages or probable values have a meaning only when the
motion is confined to a classically limited region, and that these limits
can be assigned a priori only in the case of a conservative motion,
i.e. a motion with a given (constant) value of the energy W. Within
the allowed region, limited by the surface W − U = 0, the amplitude
function A must satisfy the equation

$$ \operatorname{div}(A^2\,\nabla s_0) = 0, $$

which can be solved after the function $s_0$ has been determined from
the Hamilton-Jacobi equation (17). It should be remembered that this
equation, which represents another form of equation (17 a), expresses
the law of the conservation of the copies of the particle, or of the
probability of its location [cf. (19)].

Although there is in general no exact equivalence between the
classical and the wave-mechanical average values, yet there are special
cases when this equivalence turns out to be exact. An interesting case
of this sort is provided by the so-called ‘virial’, i.e. by the quantity

$$ V = \frac{\partial U}{\partial x}\,x + \frac{\partial U}{\partial y}\,y + \frac{\partial U}{\partial z}\,z, $$

which was introduced by Clausius in the kinetic theory of gases.

For a motion restricted to a limited region, the time average of this
quantity V is connected with the time average of the kinetic energy
by the relation

$$ 2\bar T = \bar V. \tag{26} $$

This is called the ‘virial theorem’. It can be derived as follows: We
multiply Newton's equations of motion

$$ m_k\frac{d^2x_k}{dt^2} = -\frac{\partial U}{\partial x_k}, \quad\text{etc.}, $$

by the corresponding coordinates and write

$$ x_k\frac{d^2x_k}{dt^2} = \frac{d}{dt}\left(x_k\frac{dx_k}{dt}\right) - \left(\frac{dx_k}{dt}\right)^2. $$

Adding these transformed equations, we get

$$ \sum_k m_k\frac{d}{dt}\left(x_k\frac{dx_k}{dt}+\dots\right) - \sum_k m_k\left[\left(\frac{dx_k}{dt}\right)^2+\dots\right] = -\sum_k\left(\frac{\partial U}{\partial x_k}\,x_k+\dots\right). $$

Formula (26) is then obtained by averaging with respect to the time
and taking account of the fact that the mean value of

$$ \frac{d}{dt}\sum_k m_k\left(x_k\frac{dx_k}{dt}+\dots\right) $$

vanishes. If we replace the kinetic energy T by the difference W − U
and assume that the potential energy is a homogeneous function of the
nth degree in the coordinates, formula (26) reduces to the form

$$ 2(W-\bar U) = n\bar U \quad\text{or}\quad \bar U = \frac{2}{n+2}\,W. \tag{26 a} $$


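It can easily be shown that this relation remains exactly valid in wave
mechanics if $\bar U$ is defined as the integral $\int U\psi\psi^*\,dV$ and $\psi$ is defined
as the exact solution of the corresponding Schrödinger equation. As
an example we shall consider the simplest case of a one-dimensional
wave-mechanical problem which is described by the equation

$$ \frac{d^2\psi}{dx^2} + \frac{8\pi^2 m}{h^2}(W-U)\psi = 0. $$

If we multiply this equation by $x\,d\psi^*/dx$ and the conjugate equation
by $x\,d\psi/dx$ and add, we obtain

$$ x\frac{d}{dx}\left(\frac{d\psi}{dx}\frac{d\psi^*}{dx}\right) + \frac{8\pi^2 m}{h^2}\,W x\frac{d}{dx}(\psi\psi^*) - \frac{8\pi^2 m}{h^2}\,U x\frac{d}{dx}(\psi\psi^*) = 0. $$

By partial integration with respect to x, taking into account the
boundary conditions ($\psi = 0$ and $d\psi/dx = 0$ for $x = \pm\infty$), we get

$$ -\int_{-\infty}^{+\infty}\frac{d\psi}{dx}\frac{d\psi^*}{dx}\,dx - \frac{8\pi^2 m}{h^2}\,W\int_{-\infty}^{+\infty}\psi\psi^*\,dx + \frac{8\pi^2 m}{h^2}\int_{-\infty}^{+\infty}\frac{d(xU)}{dx}\,\psi\psi^*\,dx = 0, $$

or, since $\int_{-\infty}^{+\infty}\psi\psi^*\,dx = 1$ and $\int_{-\infty}^{+\infty} f\psi\psi^*\,dx = \bar f$,

$$ -\int_{-\infty}^{+\infty}\frac{d\psi}{dx}\frac{d\psi^*}{dx}\,dx - \frac{8\pi^2 m}{h^2}\left[W - \overline{\frac{d(xU)}{dx}}\right] = 0. $$

Further, by multiplying the Schrödinger equation by $\psi^*$ and integrating, we obtain

$$ \int_{-\infty}^{+\infty}\psi^*\frac{d^2\psi}{dx^2}\,dx + \frac{8\pi^2 m}{h^2}\int_{-\infty}^{+\infty}(W-U)\,\psi\psi^*\,dx = 0, $$

i.e.

$$ \int_{-\infty}^{+\infty}\psi^*\frac{d^2\psi}{dx^2}\,dx + \frac{8\pi^2 m}{h^2}(W-\bar U) = 0, $$

or, transforming the first term by partial integration,

$$ -\int_{-\infty}^{+\infty}\frac{d\psi}{dx}\frac{d\psi^*}{dx}\,dx + \frac{8\pi^2 m}{h^2}(W-\bar U) = 0. $$

We have therefore

$$ W - \bar U + W - \overline{\frac{d(Ux)}{dx}} = 0 $$

or

$$ 2(W-\bar U) = \overline{x\frac{dU}{dx}}. $$

This is exactly formula (26) for the special case that we have considered.†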
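The wave-mechanical form of the virial relation can be verified numerically for a case whose exact solution is known. The sketch below assumes the ground state of the linear oscillator in units where h/2π = m = 1 and U = x²/2, so that ψ = π^(-1/4) exp(−x²/2) and W = 1/2; these units, the integration interval and the helper names are illustrative assumptions, not taken from the text.

```python
import math

def expectation(f, psi, lo=-10.0, hi=10.0, n=20000):
    """Integral of f(x)*psi(x)**2 over (lo, hi) by the midpoint rule (psi real)."""
    dx = (hi - lo) / n
    total = 0.0
    for k in range(n):
        x = lo + (k + 0.5) * dx
        total += f(x) * psi(x) ** 2 * dx
    return total

def psi0(x):
    """Oscillator ground state for U = x**2/2 in units h/(2*pi) = m = 1."""
    return math.pi ** -0.25 * math.exp(-0.5 * x * x)

W = 0.5       # ground-state energy in these units
n_deg = 2     # U is homogeneous of degree 2
U_mean = expectation(lambda x: 0.5 * x * x, psi0)   # mean potential energy
virial_mean = expectation(lambda x: x * x, psi0)    # mean of x * dU/dx

print(U_mean, 2.0 * W / (n_deg + 2))        # the two sides of (26 a)
print(2.0 * (W - U_mean), virial_mean)      # the two sides of 2(W - U) = x dU/dx
```

In both printed pairs the two numbers agree to within the quadrature error, as (26) and (26 a) require for an exact solution.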

Another illustration of the connexion between the wave-mechanical
and the classical theory is given by the similarity of the classical
equations of motion,

$$ m\frac{d^2x}{dt^2} = -\frac{\partial U}{\partial x}, \quad\text{etc.}, $$

and the wave-mechanical relations

$$ m\frac{d^2\bar x}{dt^2} = -\overline{\frac{\partial U}{\partial x}}, \quad\text{etc.}, \tag{27} $$

between the corresponding average (or probable) values of the quantities
involved.

The relations (27) were found by P. Ehrenfest. They are usually
referred to, in connexion with the propagation of a wave packet, as the
equations of motion of the ‘centre’ or ‘centroid’ of the latter, that is,
of the point with the coordinates

$$ \bar x = \int x\,\psi\psi^*\,dV, \qquad \bar y = \int y\,\psi\psi^*\,dV, \qquad \bar z = \int z\,\psi\psi^*\,dV. \tag{27 a} $$

If the wave function $\psi$ represents a wave packet formed by superposing
waves with slightly different frequencies (i.e. motions with slightly
different energies), the coordinates $\bar x, \bar y, \bar z$ are certain functions of the
time (in the case of a stationary state where the dependence of $\psi$ upon
the time is specified by the factor $e^{-i2\pi\nu t}$ they reduce to constants), so
that we can differentiate them with regard to the time. The corresponding
quantities can be defined as the average values of the components
of the velocity of the particle or its acceleration, etc.

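We shall prove the relations (27) for the simplest case of a motion
parallel to the x-axis (the proof can easily be extended to the case of
three-dimensional motion). We have, by the definition of $\bar x$,

$$ \frac{d\bar x}{dt} = \frac{d}{dt}\int_{-\infty}^{+\infty} x\,\psi\psi^*\,dx = \int_{-\infty}^{+\infty} x\,\frac{\partial}{\partial t}(\psi\psi^*)\,dx, $$

since x and t are independent variables.

† The proof given is due to B. Finkelstein.

Now $\psi$ and $\psi^*$ satisfy the equations

$$ \frac{\partial\psi}{\partial t} = \frac{ih}{4\pi m}\left(\frac{\partial^2\psi}{\partial x^2} - \mu U\psi\right), \qquad \frac{\partial\psi^*}{\partial t} = -\frac{ih}{4\pi m}\left(\frac{\partial^2\psi^*}{\partial x^2} - \mu U\psi^*\right), $$

where $\mu = 8\pi^2 m/h^2$. Hence

$$ \frac{d\bar x}{dt} = \frac{ih}{4\pi m}\int_{-\infty}^{+\infty} x\left(\psi^*\frac{\partial^2\psi}{\partial x^2} - \psi\frac{\partial^2\psi^*}{\partial x^2}\right)dx. $$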

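By partial integration, in conjunction with the fact that

$$ \int_{-\infty}^{+\infty}\frac{df}{dx}\,dx = f(+\infty) - f(-\infty) $$

vanishes if the function f contains $\psi$ or $\partial\psi/\partial x$ as a factor (since
$\int_{-\infty}^{+\infty}\psi\psi^*\,dx$ must be finite and equal to 1), we obtain

$$ \frac{d\bar x}{dt} = \frac{h}{4\pi m i}\int_{-\infty}^{+\infty}\left(\psi^*\frac{\partial\psi}{\partial x} - \psi\frac{\partial\psi^*}{\partial x}\right)dx. \tag{27 b} $$

This expression could be obtained directly from the relation
$\frac{\partial}{\partial t}(\psi\psi^*) + \frac{\partial j}{\partial x} = 0$ (which is a special case of (19)) and the formula
$j = \frac{h}{4\pi i m}\left(\psi^*\frac{\partial\psi}{\partial x} - \psi\frac{\partial\psi^*}{\partial x}\right)$ for the current density. Putting $j = \psi\psi^*\,v(x)$,
where $v(x)$ is the average velocity at the point x, we can rewrite the
preceding equation in the form

$$ \frac{d\bar x}{dt} = \int_{-\infty}^{+\infty} v(x)\,\psi\psi^*\,dx, $$

which agrees with the definition of $d\bar x/dt$ as the average value of the
velocity of the particle irrespective of its position.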

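By differentiating (27 b) with respect to the time, we obtain

$$ \frac{d^2\bar x}{dt^2} = -\frac{h^2}{16\pi^2 m^2}\int_{-\infty}^{+\infty} dx\left[\left(\frac{\partial^2\psi^*}{\partial x^2} - \mu U\psi^*\right)\frac{\partial\psi}{\partial x} - \psi^*\frac{\partial}{\partial x}\left(\frac{\partial^2\psi}{\partial x^2} - \mu U\psi\right) + \left(\frac{\partial^2\psi}{\partial x^2} - \mu U\psi\right)\frac{\partial\psi^*}{\partial x} - \psi\frac{\partial}{\partial x}\left(\frac{\partial^2\psi^*}{\partial x^2} - \mu U\psi^*\right)\right], $$

i.e., after partial integration,

$$ m\frac{d^2\bar x}{dt^2} = -\overline{\frac{\partial U}{\partial x}}, $$

where

$$ -\overline{\frac{\partial U}{\partial x}} = -\int_{-\infty}^{+\infty}\frac{\partial U}{\partial x}\,\psi\psi^*\,dx $$

is the average (or probable) value of the force acting on the particle.
It must be emphasized that this value refers not to the average
(or probable) position of the particle, determined by the centre of the
packet (otherwise this centre would move exactly according to the
classical mechanics), but to all possible positions.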

If the dimensions of the packet are very small (which means that 
the uncertainty in the estimation of the particle’s velocity is very large) 
the motion of its centre closely follows classical motion. This, however, 
persists only for a very short time, for the packet will spread, the rate 
of this spreading being the larger the smaller its original dimensions 
(i.e. the larger the original uncertainty in the velocity). 


5. Motion in a Limited Region; Quantum Conditions and Average Values

We shall now investigate the case of a (three-dimensional) motion
restricted classically to a finite region of space (where W − U > 0), and
derive the ‘quantization rules’ characteristic of such a motion with the
help of the approximate wave-mechanical theory based on the classical
determination of the phase or action function s (= $S^0$) by means of the
Hamilton-Jacobi equation. A motion of this kind must obviously have
a periodic or quasi-periodic character, so that the path described by
the particle may fill up the whole region or pass many times in various
directions through the same or nearly the same point (as, for instance,
in the simple case of the oscillatory motion of a particle along a straight
line). If the particle is replaced by a continuous assembly of its copies,
a rather complicated picture results, different copies passing simultaneously
through the same point with velocities which are in general
different both in regard to direction and (if the field of force varies
with the time) in regard to magnitude. The latter must, of course,
remain a single-valued function of the coordinates in the case of motion
with a given (constant) value of the total energy W. The function
$\phi = s/m$, which can be defined as the velocity potential, must, however,
in this case (as well as in the general case of non-conservative motion)
be a multiple-valued function of the coordinates. Considering the copy
assembly as a kind of fluid, we can illustrate the case in question by
the familiar type of fluid motion with closed stream-lines, each stream-line
representing the path of all the particles situated on it. In the
associated wave picture these closed paths of the separate particles or
copies must be interpreted as closed rays.

Now a fluid motion of this type can be irrotational if, for instance,
the fluid is flowing in a closed tube or around some closed tube. The
velocity v of the particles, as a function of their coordinates, can then
be represented as the gradient of a potential $\phi$, provided the latter is
defined as a multiple-valued function of the coordinates. In fact, taking
the integral of the velocity along a line $\sigma$ connecting two points $P_1$ and
$P_2$, then, since the projection $v_\sigma$ of v on the line element $d\sigma$ is, by
definition, equal to $d\phi/d\sigma$, we get

$$ \int_{P_1}^{P_2} v_\sigma\,d\sigma = \phi(P_2) - \phi(P_1). $$

If the line is closed, i.e. if the points $P_1$ and $P_2$ coincide, this integral
should be equal to zero, irrespective of the shape of the line, unless we
assume that for closed lines of a certain type the potential $\phi$ may change
after a ‘round trip’ by an amount $\Delta\phi$ equal to the value of the integral
$\oint v_\sigma\,d\sigma$ taken along the corresponding closed line. If the latter coincides
with a stream-line, the integral will certainly be different from zero,
since along this line we must have $v_\sigma = |v|$.

Now it can easily be proved that in the case of irrotational motion
the integral $\oint v_\sigma\,d\sigma$, which is called the ‘circulation’, will have the same
value for all closed lines of the same family, i.e. of the same general type.
In the case of a fluid flowing around a closed tube along closed stream-lines
(Fig. 1), we must distinguish closed lines of two families: those
which do not surround the tube, and those which do. For the former
the circulation will be equal to zero, while for the latter it will have
a certain value different from zero. This result follows from the
transformation of the line integral $\oint v_\sigma\,d\sigma$, by means of Stokes's formula,
into the integral $\int (\operatorname{curl} v)_n\,dS$ over any surface S limited by the line $\sigma$.
In the case of the lines of the first family the surface S will be situated
entirely within the fluid, so that the integral will vanish, since the
motion is supposed to be irrotational ($\operatorname{curl} v = 0$). In the case of the
lines of the second family the surface S will cut the tube around which
the fluid is flowing. Since for points inside the tube the idea of velocity
has no meaning, we can replace the surface S by another surface S'
bounded by two closed lines of the second family. Stokes's formula
applied to this surface which lies wholly within the fluid, and for which
therefore the integral $\int (\operatorname{curl} v)_n\,dS$ vanishes, leads to the result that
the integral $\oint v_\sigma\,d\sigma$ taken over the double boundary of S' must vanish
if the ‘round trip’ is made in opposite directions along the two constituent
lines, whence it follows that the circulation will have the same
value for both lines if the round trip is made in the same direction.

Fig. 1
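The two families of closed lines can be illustrated numerically. The sketch below is an assumption-laden model, not the configuration of Fig. 1: the tube is shrunk to the z-axis, the velocity potential is taken to be the polar angle (so that v = (−y, x)/(x² + y²) in the plane), and the closed lines are circles. The circulation is then 2π for every circle that surrounds the axis and zero for every circle that does not.

```python
import math

def velocity(x, y):
    """Irrotational plane flow around the z-axis: gradient of the polar angle."""
    r2 = x * x + y * y
    return -y / r2, x / r2

def circulation(cx, cy, radius, n=20000):
    """Line integral of the velocity along a circle (midpoint rule in the angle)."""
    total = 0.0
    dtheta = 2.0 * math.pi / n
    for k in range(n):
        t = (k + 0.5) * dtheta
        x = cx + radius * math.cos(t)
        y = cy + radius * math.sin(t)
        vx, vy = velocity(x, y)
        # Line element: radius * (-sin t, cos t) * dtheta.
        total += (-vx * math.sin(t) + vy * math.cos(t)) * radius * dtheta
    return total

print(circulation(0.0, 0.0, 1.0))   # circle around the axis
print(circulation(0.3, 0.2, 2.0))   # another circle of the same family
print(circulation(3.0, 0.0, 1.0))   # circle that does not surround the axis
```

The first two values coincide although the circles differ in centre and radius, which is the statement that the circulation depends only on the family to which the closed line belongs.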

It may be mentioned that exactly similar results are met with in the 
theory of the magnetic field generated by a linear electric current. This 
field—outside the wire along which the current is flowing—is also 
irrotational, so that the magnetic field strength can be defined as the 
gradient of a certain magnetic potential. With every trip around the 
wire along any closed line (encircling this wire only once) this potential 
must change by a definite value, namely im, where i is the strength 
of the current. 

The preceding results can be applied without substantial modification
to the flow of the fictitious fluid represented by the copy assembly of
a particle moving in a limited region. In the copy assembly, however,
we must remember that different copies may be imagined to pass
simultaneously through the same point in different directions. This is,
of course, impossible in the case of real particles. In particular, closed
stream-lines may degenerate into ‘double lines’, i.e. unclosed lines along
which the copies move first in one and then in the opposite direction
(oscillatory motion).† The ‘circulation’ $\oint v_\sigma\,d\sigma$ for such a double line
will not be equal to zero, but, on the contrary, will be equal to double
the value of the integral $\int v_\sigma\,d\sigma$ for a one-way trip. As a result the
velocity potential $\phi = s/m$, in addition to the multiplicity considered
above, may acquire a duplicity of an entirely different character,
corresponding to the possible presence at each point of two copies moving
in opposite or, in general, in different directions.

Leaving aside this duplicity we see that, in the case of a particle
confined to a finite region of space, the function s representing the
mechanical action or the momentum-potential of the copies of this
particle must, so long as the motion of these copies is supposed to be
irrotational, be a multiple-valued function of the coordinates, i.e. it
must change by a certain amount $\Delta s$ for all closed lines (including
double lines) of a certain family. It should be mentioned that ‘round
trips’ along any of these lines have nothing to do with the actual
motion, being performed not by definite copies (the latter need not
move in closed lines), but by the process of linear integration referring
to a definite instant of time. The change $\Delta s$ of the function s for any
such round trip is called a ‘periodicity modulus’ of s. From the point
of view of the wave picture associated with the motion of the copy
assembly of the particle these ‘periodicity moduli’ divided by the constant
h represent the number of wave-lengths contained in the corresponding
closed lines. In fact $\partial s/\partial\sigma = g_\sigma$ is the component of the
momentum of the particle along the line-element $d\sigma$ and according to
de Broglie's relation $\partial(s/h)/\partial\sigma = g_\sigma/h = k_\sigma$ must be equal to the
corresponding component of the ‘wave-number vector’ $k = g/h$ of the
associated waves. The integral $\oint k_\sigma\,d\sigma = \Delta s/h$ may therefore be defined
as the number of wave-lengths contained in the line $\sigma$, or, more exactly,
as the number of wave-crests cut by this line, or still more exactly, as
the difference between the number of waves cut by $\sigma$ in the positive
and in the negative direction (i.e. in the direction of propagation and in
the opposite direction).

† The tube around which the fluid is supposed to flow degenerating into a ribbon
with zero thickness.

Now it is clear that in the case of motion corresponding to a definite
energy, the wave system associated with it must be such that the
number of waves cut by any closed line should be integral, corresponding
to a change of the phase $\phi = 2\pi s/h$ by an integral multiple of $2\pi$,
a change which is irrelevant for the value of the wave function $\psi = Ae^{i\phi}$.
In the contrary case the latter would also be a multiple-valued function
of the coordinates, and would not represent a stationary system of
standing waves (each standing wave being produced by the superposition
of waves travelling in different directions), determined by the
condition that the wave function $\psi$ should vanish at or near the
boundary of the region where the particle is supposed to move.

It thus follows from the condition of single-valuedness for the wave
function $\psi$ that the ‘periodicity moduli’ of the ‘action function’ s must
be integral multiples of h.

This condition, which—it should be remembered—refers to the case 
of motion confined to a (classically) limited region, can easily be shown 
to be equivalent to the quantum conditions of the old quantum theory 
discovered by Bohr and by Sommerfeld. 

For the general formulation of these quantum conditions, it is
necessary, instead of the original rectangular coordinates x, y, z, to
introduce new variables (generalized coordinates) $q_1, q_2, q_3$. If we succeed
in so choosing these new variables that s assumes the form

$$ s = \sum_{\alpha=1}^{3} s_\alpha(q_\alpha) \tag{28} $$

(‘separation variables’), then the quantum conditions run as follows:

$$ \oint p_\alpha\,dq_\alpha = (\Delta s)_\alpha = n_\alpha h \qquad (n_\alpha \text{ an integer}). \tag{28 a} $$


Here the various $p_\alpha$ (= $\partial s/\partial q_\alpha$) are the ‘generalized momenta’ and
$(\Delta s)_\alpha$ are the ‘principal moduli of periodicity’ of the function s, i.e. those
alterations of this function which correspond to a ‘cyclic’ change of
one of the separation coordinates when the remaining two are kept
fixed. By a ‘cyclic’ change of the coordinate $q_\alpha$ we mean an alteration
such that the given particle returns to its original position and
therefore the rectangular coordinates assume their original values. If
the coordinate $q_\alpha$ has the character of an angle so that the rectangular
coordinates are periodic functions of it, then the ‘cyclic change’ of $q_\alpha$
is simply the increase by the corresponding period $\Delta q_\alpha$ (for example,
$2\pi$). Otherwise it is an oscillation of $q_\alpha$ within certain limits determined
by the nature of the field of force. The cyclic alterations of the individual
separation coordinates in the actual motion of the system
take place in periods of time $\Delta t_\alpha$ which are in general different from
one another, so that the motion with regard to the time appears to be
non-periodic or conditionally periodic. This dependence of the variables
$q_\alpha$ on the time plays no part in the ‘quantizing’ defined by formula (28 a).
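For a single oscillating coordinate the condition (28 a) can be evaluated numerically. The sketch below assumes a linear oscillator with U = ½m(2πν)²x², for which the closed-path integral of p dx equals W/ν, so that the condition gives W = nhν; the function names and the substitution x = mid + half·sin θ (which removes the square-root behaviour of p at the turning points) are choices made for this illustration and are not part of the text.

```python
import math

def action_integral(U, W, m, x1, x2, n=20000):
    """Closed-path integral of p dx = 2 * integral from x1 to x2 of sqrt(2m(W - U)).

    x1 and x2 are the turning points, U is a callable potential energy.
    """
    mid, half = 0.5 * (x1 + x2), 0.5 * (x2 - x1)
    dtheta = math.pi / n
    total = 0.0
    for k in range(n):
        theta = -0.5 * math.pi + (k + 0.5) * dtheta
        x = mid + half * math.sin(theta)
        p = math.sqrt(max(2.0 * m * (W - U(x)), 0.0))
        total += p * half * math.cos(theta) * dtheta
    return 2.0 * total

def oscillator_potential(m, nu):
    """U(x) = (1/2) m (2 pi nu)^2 x^2 for a linear oscillator of frequency nu."""
    return lambda x: 0.5 * m * (2.0 * math.pi * nu) ** 2 * x * x

# Illustrative units (assumed): m = nu = h = 1, energy W = 3 h nu.
m, nu, h = 1.0, 1.0, 1.0
W = 3.0 * h * nu
a = math.sqrt(2.0 * W / m) / (2.0 * math.pi * nu)   # classical turning point
J = action_integral(oscillator_potential(m, nu), W, m, -a, a)
print(J / h)   # number of quanta n in the condition (28 a)
```

The same routine can be applied to any one-dimensional potential well once its turning points for the energy W are known.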

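The generalized momenta appearing in (28 a) can be defined, and
indeed are usually defined, in a different way, namely as the partial
derivatives of the kinetic energy T, expressed as a function of the
generalized coordinates and of the corresponding ‘velocities’ $dq_\alpha/dt = \dot q_\alpha$,
with respect to the latter. The equivalence of both definitions is obvious
in the case of rectangular coordinates, since $T = \tfrac{1}{2}m(v_x^2+v_y^2+v_z^2)$ and
$g_x = \partial s/\partial x = \partial T/\partial v_x$, etc. If the coordinates are replaced by new
(generalized) coordinates $q_\alpha(x, y, z)$, we have

$$ \dot q_\alpha = \frac{\partial q_\alpha}{\partial x}\,v_x + \frac{\partial q_\alpha}{\partial y}\,v_y + \frac{\partial q_\alpha}{\partial z}\,v_z, $$

whence $\partial\dot q_\alpha/\partial v_x = \partial q_\alpha/\partial x$, etc. We thus get

$$ \frac{\partial s}{\partial x} = \sum_{\alpha=1}^{3}\frac{\partial s}{\partial q_\alpha}\frac{\partial q_\alpha}{\partial x}, \qquad \frac{\partial T}{\partial v_x} = \sum_{\alpha=1}^{3}\frac{\partial T}{\partial\dot q_\alpha}\frac{\partial\dot q_\alpha}{\partial v_x} = \sum_{\alpha=1}^{3}\frac{\partial T}{\partial\dot q_\alpha}\frac{\partial q_\alpha}{\partial x}, $$

and consequently $\dfrac{\partial T}{\partial\dot q_\alpha} = \dfrac{\partial s}{\partial q_\alpha} = p_\alpha$.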

The formulation of the quantum conditions in the form (28 a) is some¬ 
times possible in two or more different ways—if there exist several 
sets of ‘separable’ coordinates. Theoretically it is possible—in a single 
way at least—for any type of motion (restricted to a finite region). 
Practically, however, the 'separation coordinates’ can be found only 
for simple types of motion (i.e. of the field of force). If the separation 
coordinates cannot be found, then the quantum conditions—in the sense 
of Bohr’s theory—must be stated in the more general form indicated 
above, namely, that the moduli of periodicity of s with respect to any 
closed curve should be equal to an integral multiple of h (or to zero). 

We shall now turn to the question of the relation between the wave- 
mechanical average or probable value of any function of the coordinates 
of the particle for a given quantized state of motion and the corre¬ 
sponding classical 'time average’ of this function. The solution of this 
question depends upon the introduction of new coordinates of a still 
more general kind than those considered above in connexion with the 
formulation of the quantum conditions. These still more general co¬ 
ordinates are not directly expressible in terms of the original ones, but 
in terms of the original coordinates and the corresponding momenta, 
the new momenta being also functions of the old momenta and of the 
old coordinates. 

Coordinate or rather coordinate-momenta transformations of this
type were introduced by Hamilton and are called contact or canonical
transformations (the transformation considered above being a particular
case of these transformations).

The theory of canonical transformations is based upon the preservation
of the so-called ‘canonical form’ of the classical equations of
motion. In the case of rectangular coordinates these canonical equations
can be obtained directly from the usual equations of motion
$m\,d^2x/dt^2 = -\partial U/\partial x$, etc., and have the form

$$ \frac{dg_x}{dt} = -\frac{\partial H}{\partial x}, \ \dots, \qquad \frac{dx}{dt} = \frac{\partial H}{\partial g_x}, \ \dots, \tag{29} $$

where

$$ H = \frac{1}{2m}(g_x^2+g_y^2+g_z^2) + U \tag{29 a} $$

is the total energy expressed as a function of the coordinates and
momenta, and is usually denoted as the ‘Hamiltonian function’. The
equations (29) can be interpreted as referring to a particle moving not
in ordinary space with the three coordinates x, y, z but in the
six-dimensional phase-space (Part I, Chap. V) with the ‘coordinates’ x, y, z,
$g_x, g_y, g_z$, the time derivatives of these coordinates representing the six
components of the ‘velocity’ in phase-space and H being a function of
the ‘position’ of the particle in the phase-space.†

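For the sake of uniformity in notation we shall, in the following,
instead of x, y, z write $Q_1, Q_2, Q_3$, and instead of $g_x, g_y, g_z$ write $P_1, P_2, P_3$.
The equations (29) then become

$$ \frac{dP_\alpha}{dt} = -\frac{\partial H}{\partial Q_\alpha}, \qquad \frac{dQ_\alpha}{dt} = \frac{\partial H}{\partial P_\alpha}. \tag{29 b} $$

We now introduce new coordinates $Q'_1, Q'_2, Q'_3$ determined by three
equations of the form

$$ Q'_\beta = Q'_\beta(Q_1, Q_2, Q_3) \quad\text{or}\quad Q_\alpha = Q_\alpha(Q'_1, Q'_2, Q'_3) \qquad (\alpha, \beta = 1, 2, 3). \tag{30} $$

We then define the new momenta $P'_1, P'_2, P'_3$ by the formulae

$$ P'_\beta = \sum_\alpha P_\alpha\frac{\partial Q_\alpha}{\partial Q'_\beta} \quad\text{or}\quad P_\alpha = \sum_\beta P'_\beta\frac{\partial Q'_\beta}{\partial Q_\alpha}, \tag{30 a} $$

which obviously do not assume a knowledge of the action function s.
It can then easily be shown that these new coordinates and momenta
satisfy a system of equations of the same form as (29 b),

$$ \frac{dP'_\beta}{dt} = -\frac{\partial H'}{\partial Q'_\beta}, \qquad \frac{dQ'_\beta}{dt} = \frac{\partial H'}{\partial P'_\beta} \qquad (\beta = 1, 2, 3), \tag{31} $$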


where H' is the new Hamiltonian function which is obtained by replacing
in the original function H(Q, P) the old coordinates and
momenta by the new, according to the formulae (30) and (30 a). The
transformation defined by these formulae is called a ‘point transformation’.
As already mentioned, it is a special case of the canonical
transformations. A canonical transformation (of the coordinates and
momenta) is defined by the formulae

$$ P_\alpha = \frac{\partial\Phi}{\partial Q_\alpha}, \qquad Q'_\beta = \frac{\partial\Phi}{\partial P'_\beta}, \tag{31 a} $$

where $\Phi(Q, P')$ is a completely arbitrary function of the original
coordinates and the new momenta. If, in particular, we put

$$ \Phi = \sum_\beta P'_\beta\, f_\beta(Q_1, Q_2, Q_3), $$

we obtain, by (31 a),

$$ Q'_\beta = f_\beta(Q_1, Q_2, Q_3), \qquad P_\alpha = \sum_\beta P'_\beta\frac{\partial f_\beta}{\partial Q_\alpha}, $$

which corresponds to the point transformation (30), (30 a).

† Instead of one particle one can consider a continuous assembly of its copies,
distributed not in the ordinary space as before, but in the phase-space with a density
depending in general upon the time.

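The fact that the original canonical equations (29) are transformed
by (31 a) into equations of the same canonical form (31) can be shown
as follows:

We form the complete differential or rather the variation of the
function $\Phi$, corresponding to a virtual variation (completely independent
of the actual motion) of the variables Q, P':

$$ \delta\Phi = \sum_\alpha P_\alpha\,\delta Q_\alpha + \sum_\beta Q'_\beta\,\delta P'_\beta, $$

and differentiate this expression with regard to the time. We also take
the time derivative of $\Phi$

$$ \frac{d\Phi}{dt} = \sum_\alpha P_\alpha\frac{dQ_\alpha}{dt} + \sum_\beta Q'_\beta\frac{dP'_\beta}{dt} $$

and form its variation. By subtracting the expressions thus obtained,
we get, remembering that $\delta$ and d/dt are commutative,

$$ \sum_\alpha\left(\frac{dP_\alpha}{dt}\,\delta Q_\alpha - \frac{dQ_\alpha}{dt}\,\delta P_\alpha\right) + \sum_\beta\left(\frac{dQ'_\beta}{dt}\,\delta P'_\beta - \frac{dP'_\beta}{dt}\,\delta Q'_\beta\right) = 0. $$

Now by (29 b) we have

$$ \sum_\alpha\left(\frac{dP_\alpha}{dt}\,\delta Q_\alpha - \frac{dQ_\alpha}{dt}\,\delta P_\alpha\right) = -\sum_\alpha\left(\frac{\partial H}{\partial Q_\alpha}\,\delta Q_\alpha + \frac{\partial H}{\partial P_\alpha}\,\delta P_\alpha\right) = -\delta H. $$

Hence, in virtue of H(P, Q) = H'(P', Q'),
we obtain

$$ \sum_\beta\left(\frac{dQ'_\beta}{dt}\,\delta P'_\beta - \frac{dP'_\beta}{dt}\,\delta Q'_\beta\right) = \delta H' = \sum_\beta\left(\frac{\partial H'}{\partial Q'_\beta}\,\delta Q'_\beta + \frac{\partial H'}{\partial P'_\beta}\,\delta P'_\beta\right). $$

Since the variations $\delta Q'_\beta$ and $\delta P'_\beta$ are arbitrary, we can equate their
coefficients. In this way we get equations (31).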

Those canonical transformations, in which the transformed Hamiltonian
H' depends only on the momenta P' and not on the coordinates
Q', play a special role. Such coordinates are usually called cyclic. The
equations (31) reduce in this case to

$$ P'_\beta = \text{const.}, \qquad \frac{dQ'_\beta}{dt} = \frac{\partial H'}{\partial P'_\beta} = \omega_\beta = \text{const.}, $$

i.e. $Q'_\beta = \omega_\beta t + \phi_\beta$.

If the transformation function $\Phi$ leading to cyclic coordinates is
known, the mechanical problem can be regarded as solved, for the
original coordinates and momenta are then expressed according to the
equations (31 a) as functions of the time which, besides t, only contain
constants $P'_\beta$, $\omega_\beta$, and $\phi_\beta$.

Now it follows from (31 a) that this special transformation function
is just the action function s regarded as a function of $Q_1, Q_2, Q_3$ and of
three arbitrary constants $P'_1, P'_2, P'_3$ which necessarily appear on solving
the Hamilton-Jacobi equation (16) or (17) by which this function is
defined. These constants of integration can be expressed in terms
of the three principal moduli of periodicity of the action function
$J_\alpha = (\Delta s)_\alpha$ with regard to a system of separable coordinates $q_1, q_2, q_3$ (which
we need neither actually know nor consider in detail here). Replacing
the original constants $P'_\beta$ by their expressions in terms of $J_1, J_2, J_3$ we
can write the transformation function $\Phi$ in the form $s(x, y, z; J_1, J_2, J_3)$
and define the constants $J_\alpha$ as the new momenta ($P'_\alpha = J_\alpha$). Considered
from this point of view these constants are called the ‘action variables’
of the problem. The corresponding cyclic coordinates are called the
‘angle variables’. We shall denote them by $w_\beta$ (= $Q'_\beta$).

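We have therefore

$$w_\beta = \omega_\beta t + \phi_\beta, \tag{32}$$

where according to the transformed canonical equations (31)

$$\omega_\beta = \frac{\partial H'}{\partial J_\beta} = \frac{\partial W}{\partial J_\beta} = \text{const.} \quad (H' = W) \tag{32 a}$$

and

$$P_\alpha = \frac{\partial s}{\partial Q_\alpha}, \qquad w_\beta = \frac{\partial s}{\partial J_\beta}. \tag{32 b}$$

To ascertain the dependence of the old coordinates $Q_\alpha$ on the new
coordinates $w_\beta$, we shall introduce for a moment as an intermediate
link between them the separation coordinates $q_1, q_2, q_3$. Expressed as
a function of the latter, the function $s$ assumes the form

$$s = s_1(q_1) + s_2(q_2) + s_3(q_3).$$

To a cyclic alteration of the coordinate $q_\alpha$ there corresponds by (32 b)
an alteration of the coordinate $w_\beta$ by $\Delta_\alpha w_\beta = \Delta_\alpha\,\partial s/\partial J_\beta$. We have
therefore, because $\Delta_\alpha s_\beta = J_\alpha$ if $\alpha = \beta$, and $= 0$ if $\alpha \neq \beta$,

$$\Delta_\alpha w_\beta = \frac{\partial J_\alpha}{\partial J_\beta} = \begin{cases} 1 & (\alpha = \beta), \\ 0 & (\alpha \neq \beta). \end{cases}$$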

These formulae show that when any angle variable $w_\beta$ is increased by
1 and the remaining $w$'s are maintained constant, which corresponds
to the cyclic alteration of the separation coordinate $q_\beta$, i.e. to the
return of the particle to the original position along a '$\beta$-curve', then
the action function $s$ increases exactly by $J_\beta$.

From this it follows that the coordinates $Q_\beta$, and consequently the
momenta $P_\beta$, are periodic functions of the angle coordinates with periods
equal to 1. Each of them, as well as any function $f(Q_1, Q_2, Q_3)$ (or still
more generally $f(Q, P)$), can be expressed in the form of a triple Fourier
series

$$f = \sum f_{k_1,k_2,k_3}\, e^{i2\pi(k_1 w_1 + k_2 w_2 + k_3 w_3)}, \tag{33}$$

where $k_1, k_2, k_3$ are integers which can assume all values from $-\infty$ to
$+\infty$, and $f_{k_1,k_2,k_3}$ are certain expansion coefficients characteristic of
the function $f$. If instead of the $w_\beta$ we put their values obtained from
(32), we get

$$f = \sum_{k_1,k_2,k_3} C_{k_1,k_2,k_3}\, e^{i2\pi\omega t}, \tag{33 a}$$

where the $C_k$ are new expansion coefficients which we can regard as
the amplitudes of various harmonic vibrations, while

$$\omega = k_1\omega_1 + k_2\omega_2 + k_3\omega_3 \tag{33 b}$$

are the frequencies of these vibrations. The quantities $\omega_\beta$, i.e. the
velocities corresponding to the angle coordinates, represent therefore
the fundamental frequencies of the motion.

We can now return to the problem of determining the time mean
value of $f$. This problem can be solved at once by means of formula
(33 a). Indeed, the required time mean value must obviously be equal
to that amplitude coefficient in (33 a) for which the vibration frequency
$\omega$ vanishes, or the sum of such coefficients if the equation $\omega = 0$ is
satisfied by several different combinations of the numbers $k_1, k_2, k_3$.
This mean value can be represented on the one hand by the general
formula $\bar f = \lim_{T\to\infty} \frac{1}{T}\int_0^T f\,dt$. On the other hand it can be represented
just as well by the formula

$$\bar f = \int_0^1\!\!\int_0^1\!\!\int_0^1 f\,dw_1\,dw_2\,dw_3, \tag{34}$$

which does not contain the time explicitly, the triple integration being
extended over the 'period cube' in the coordinate space of the angle
variables; $f$ is given as a function of the angle variables by formula (33).

The expression (34) has the form of a 'statistical' mean value
corresponding to an averaging over the various copies of the given particle
distributed with a constant density in the space of the angle coordinates
$w_1, w_2, w_3$. Its numerical agreement with the time mean value of $f$ for
a definite copy means that the curve described by the motion of such a
copy fills up this space uniformly.†
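The equality of the time mean and the mean over the period cube can be checked numerically. The following is only a sketch under stated assumptions: it uses two angle variables instead of three, and the frequencies, phases, and the sample function `f` are hypothetical values chosen for illustration, not taken from the text.

```python
import math

# Assumed (hypothetical) fundamental frequencies; their ratio is irrational,
# so the motion is non-degenerate in the sense of the footnote.
OMEGA = (1.0, math.sqrt(2.0))
PHI = (0.1, 0.7)  # assumed initial phases


def f(w1, w2):
    """Sample function with period 1 in each angle variable.

    Its constant Fourier coefficient (zero frequency) is 1.5.
    """
    return (1.5
            + math.cos(2 * math.pi * w1)
            + 0.5 * math.cos(2 * math.pi * (w1 - 2 * w2)))


def time_mean(T, steps):
    """(1/T) * integral of f along w = omega*t + phi, midpoint rule."""
    dt = T / steps
    total = 0.0
    for n in range(steps):
        t = (n + 0.5) * dt
        total += f(OMEGA[0] * t + PHI[0], OMEGA[1] * t + PHI[1])
    return total * dt / T


def cube_mean(n):
    """Average of f over the period square 0 <= w1, w2 < 1."""
    total = 0.0
    for i in range(n):
        for j in range(n):
            total += f((i + 0.5) / n, (j + 0.5) / n)
    return total / (n * n)


t_mean = time_mean(T=2000.0, steps=400_000)
c_mean = cube_mean(64)
print(t_mean, c_mean)
```

Both numbers approach the constant coefficient 1.5 of the sample function; the time mean does so only as the averaging interval grows, while the average over the period square does not involve the time at all.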

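We can now return from the angle coordinates to our original
rectangular coordinates $Q_1 = x$, $Q_2 = y$, $Q_3 = z$. In view of the fact that
the new momenta are constants, the old coordinates may be considered
practically as functions of the new coordinates alone, and vice versa.
We can thus transform the volume integral (34) according to the
well-known theorem of Jacobi, and put

$$\bar f = \int f D\,dV, \tag{34 a}$$

where $dV = dx\,dy\,dz$ and

$$D = \begin{vmatrix}
\dfrac{\partial w_1}{\partial x} & \dfrac{\partial w_1}{\partial y} & \dfrac{\partial w_1}{\partial z} \\[2mm]
\dfrac{\partial w_2}{\partial x} & \dfrac{\partial w_2}{\partial y} & \dfrac{\partial w_2}{\partial z} \\[2mm]
\dfrac{\partial w_3}{\partial x} & \dfrac{\partial w_3}{\partial y} & \dfrac{\partial w_3}{\partial z}
\end{vmatrix}.$$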


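By (32 b) this functional determinant can be written in the form

$$D = \begin{vmatrix}
\dfrac{\partial^2 s}{\partial J_1\,\partial x} & \dfrac{\partial^2 s}{\partial J_1\,\partial y} & \dfrac{\partial^2 s}{\partial J_1\,\partial z} \\[2mm]
\dfrac{\partial^2 s}{\partial J_2\,\partial x} & \dfrac{\partial^2 s}{\partial J_2\,\partial y} & \dfrac{\partial^2 s}{\partial J_2\,\partial z} \\[2mm]
\dfrac{\partial^2 s}{\partial J_3\,\partial x} & \dfrac{\partial^2 s}{\partial J_3\,\partial y} & \dfrac{\partial^2 s}{\partial J_3\,\partial z}
\end{vmatrix}. \tag{34 b}$$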


† This condition is satisfied for non-degenerate motion, that is, motion for which the
three fundamental frequencies $\omega_1, \omega_2, \omega_3$ are not commensurable with each other.

The volume integration in (34 a) must be extended over the whole
region for which $W - U \ge 0$. We are thus brought to the conclusion
that the relative probability that the particle will be found in the
volume-element $dV$, as measured by the relative duration of its presence
in this volume-element, is equal to $D\,dV$ ($\int D\,dV = 1$). Comparing this
result with the wave-mechanical average

$$\bar f = \int f\,\psi\psi^*\,dV,$$

we see that it will agree approximately with (34 a) if $\psi\psi^* = D$. Now
in the region $W - U \ge 0$ the function $s = S^0$ is real, so that the modulus
of the function $\psi = Ae^{i2\pi s/h} = e^{i2\pi S^0/h + S'}$ must reduce to $A = e^{S'}$. It
follows therefore that

$$A^2 = D.$$

It should be remembered that an exact agreement between the classical
and the wave-mechanical mean value is out of the question, not only
because of the approximative character of the preceding expression for
$\psi$ (with $s$ determined from the Hamilton-Jacobi equation), but also
because in the wave-mechanical case the integration must be extended
over all space including the classically forbidden region. However, this
region, although infinite, contributes in general only a finite and usually
a small amount to the integral $\int f\,\psi\psi^*\,dV$ because of a very rapid
decrease of the function $|\psi|^2$.

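The relation $A^2 = D$ can of course be derived in a straightforward
way by integrating the equation

$$\operatorname{div}(A^2\,\nabla s) = 0$$

[cf. (17 a)], or the equation

$$\nabla^2 S^0 + 2\,\nabla S^0\cdot\nabla S' = 0$$

to which (21b) is reduced in the case of conservative motion. This
integration has been carried out (in the case of the second equation)
by Van Vleck, who showed that $A^2$ must be proportional to the
determinant

$$\begin{vmatrix}
\dfrac{\partial^2 s}{\partial x\,\partial\alpha} & \dfrac{\partial^2 s}{\partial y\,\partial\alpha} & \dfrac{\partial^2 s}{\partial z\,\partial\alpha} \\[2mm]
\dfrac{\partial^2 s}{\partial x\,\partial\beta} & \dfrac{\partial^2 s}{\partial y\,\partial\beta} & \dfrac{\partial^2 s}{\partial z\,\partial\beta} \\[2mm]
\dfrac{\partial^2 s}{\partial x\,\partial\gamma} & \dfrac{\partial^2 s}{\partial y\,\partial\gamma} & \dfrac{\partial^2 s}{\partial z\,\partial\gamma}
\end{vmatrix},$$

where $\alpha, \beta, \gamma$ are any three integration constants occurring in the
expression of the function $s(x, y, z; \alpha, \beta, \gamma)$. This determinant is equal
to the product of $D$ with the determinant $\partial(J_1, J_2, J_3)/\partial(\alpha, \beta, \gamma)$, which is a
constant factor playing the role of a normalization constant.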

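In the special case of uni-dimensional motion the determinant (34 b)
reduces to $\partial^2 s/\partial x\,\partial J$, whereas by direct integration we obtained, in this
case, $A^2 = mC^2/(\partial s/\partial x)$. Thus we must have

$$\frac{\partial^2 s}{\partial x\,\partial J} = mC^2 \Big/ \frac{\partial s}{\partial x},$$

that is,

$$\frac{1}{2m}\frac{\partial}{\partial J}\left(\frac{\partial s}{\partial x}\right)^2 = C^2,$$

or since $\frac{1}{2m}\left(\frac{\partial s}{\partial x}\right)^2 = W - U$, we get $\frac{\partial}{\partial J}(W - U) = C^2$. This condition
is actually fulfilled, for $\partial U/\partial J = 0$ and $\partial W/\partial J = \omega = 1/T$, where $T$ is
the period of motion [according to (32 a) with $H' = W$]. Hence we
get $C^2 = 1/T$ in accordance with the simple theory developed in the
preceding section.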
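As a concrete check of the relation $\partial W/\partial J = 1/T$, one may take the linear harmonic oscillator. This worked example is not in the original; the potential and the angular frequency $\omega_0 = 2\pi/T$ are introduced here only for illustration.

```latex
% Assumed example: U = (1/2) m \omega_0^2 x^2, period T = 2\pi/\omega_0.
% The action variable is the area of the ellipse p^2/2m + U = W
% in the (x, p) plane, with semi-axes \sqrt{2W/(m\omega_0^2)} and \sqrt{2mW}.
\begin{align*}
J &= \oint \frac{\partial s}{\partial x}\,dx
   = \oint \sqrt{2m\left(W - \tfrac12 m\omega_0^2 x^2\right)}\,dx
   = \pi\,\sqrt{2mW}\,\sqrt{\frac{2W}{m\omega_0^2}}
   = \frac{2\pi W}{\omega_0} = WT, \\
W &= \frac{J}{T}, \qquad
\frac{\partial W}{\partial J} = \frac{1}{T} = C^2 .
\end{align*}
```

The energy is here a linear function of the single action variable, so that the fundamental frequency $\omega = \partial W/\partial J$ is independent of the amplitude, as it should be for this assumed potential.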



II 

OPERATORS 


6. Operational Form of Schrödinger's Equation, and Operational
Representation of Physical Quantities

The formal relation between classical mechanics and wave mechanics
can be presented in another way which not only leads us to a deeper
understanding of the theory but also to various important
generalizations.

We can arrive at this relation by examining Schrödinger's equation
(12) written in the form

$$D\psi = 0,$$

where $D$ denotes the operator

$$D = \frac{1}{2m}\left[\left(\frac{h}{2\pi i}\frac{\partial}{\partial x}\right)^2 + \left(\frac{h}{2\pi i}\frac{\partial}{\partial y}\right)^2 + \left(\frac{h}{2\pi i}\frac{\partial}{\partial z}\right)^2\right] + \frac{h}{2\pi i}\frac{\partial}{\partial t} + U.$$

This can be expressed in terms of the elementary differential operators

$$p_x = \frac{h}{2\pi i}\frac{\partial}{\partial x}, \quad p_y = \frac{h}{2\pi i}\frac{\partial}{\partial y}, \quad p_z = \frac{h}{2\pi i}\frac{\partial}{\partial z}, \quad p_t = \frac{h}{2\pi i}\frac{\partial}{\partial t} \tag{35}$$

by the formula

$$D = \frac{1}{2m}(p_x^2 + p_y^2 + p_z^2) + p_t + U. \tag{35 a}$$

The equation $D\psi = 0$ thus reduces to the classical equation

$$T + U - W = 0$$

if we replace the operators $p_x$, $p_y$, $p_z$ by the components of the
momentum, and $-p_t$ by the total energy, i.e. if instead of (35) we put

$$p_x = g_x, \quad p_y = g_y, \quad p_z = g_z, \quad p_t = -W \tag{36}$$

and cancel the function $\psi$ (considering it as a factor). Therefore the
transition from classical mechanics to wave mechanics can formally be
carried out as follows. In the 'classical' equation

$$\frac{1}{2m}(g_x^2 + g_y^2 + g_z^2) + U - W = 0, \tag{36 a}$$

which relates the components of the momentum and the total energy of
a particle, we must replace these quantities by the elementary operators
(35) and then multiply the Schrödinger operator $D$ thus obtained by
the wave function $\psi$ on the right, where 'right multiplication' simply
means applying the operator to the expression standing on its right.

The replacement of the energy $W$ by the operator $-p_t = -\dfrac{h}{2\pi i}\dfrac{\partial}{\partial t}$ has
been made before, although in a somewhat different connexion, namely,
in the transition from the wave equation

$$\nabla^2\psi + \frac{8\pi^2 m}{h^2}(W - U)\psi = 0$$

for a conservative motion to the general equation

$$\nabla^2\psi + \frac{4\pi i m}{h}\frac{\partial\psi}{\partial t} - \frac{8\pi^2 m}{h^2}U\psi = 0,$$

which applies to a motion of any kind. In the former case, since
$\psi = \psi^0(x, y, z)e^{-i2\pi Wt/h}$, the operator $p_t$ is actually equivalent to the
energy in that it satisfies the equation $p_t\psi = -W\psi$, which we could write
symbolically (dropping the function operated upon) in the form $p_t = -W$.
A similar equivalence exists between the operators $p_x$, $p_y$, $p_z$ and the
components of the momentum $g_x$, $g_y$, $g_z$ with respect to the wave
function

$$\psi = \text{const.}\; e^{i2\pi(g_x x + g_y y + g_z z - Wt)/h},$$

representing the free motion of a particle with a velocity of specified
magnitude and direction. As we know, the latter can be specified only
in this particular case. In the general case the functions $p_x\psi$, $p_y\psi$, $p_z\psi$,
$-p_t\psi$ are not equal to the products of the function $\psi$ by constant
numbers.
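The equivalence of the operators with the momentum and energy for a plane wave, and its failure for other functions, can be illustrated numerically. This is a minimal one-dimensional sketch: the units ($h/2\pi = 1$), the values of `G_X` and `MASS`, and the Gaussian test function are assumptions made for the illustration, and the derivatives are approximated by central differences.

```python
import cmath
import math

H = 2 * math.pi            # assumed units in which h / (2*pi) = 1
G_X = 1.3                  # assumed momentum component g_x
MASS = 1.0                 # assumed mass
W = G_X ** 2 / (2 * MASS)  # energy of the free motion


def plane_wave(x, t):
    """exp(i*2*pi*(g_x*x - W*t)/h), the constant factor taken as 1."""
    return cmath.exp(2j * math.pi * (G_X * x - W * t) / H)


def packet(x, t):
    """A function that is not a plane wave (time dependence ignored)."""
    return cmath.exp(-x * x)


def p_x(func, x, t, eps=1e-5):
    """(h / 2 pi i) d/dx applied to func, by a central difference."""
    return (H / (2j * math.pi)) * (func(x + eps, t) - func(x - eps, t)) / (2 * eps)


def p_t(func, x, t, eps=1e-5):
    """(h / 2 pi i) d/dt applied to func, by a central difference."""
    return (H / (2j * math.pi)) * (func(x, t + eps) - func(x, t - eps)) / (2 * eps)


x0, t0 = 0.4, 0.9
ratio_x = p_x(plane_wave, x0, t0) / plane_wave(x0, t0)   # close to  g_x
ratio_t = p_t(plane_wave, x0, t0) / plane_wave(x0, t0)   # close to -W

# For the Gaussian the ratio (p_x psi)/psi depends on the point x.
packet_ratios = [p_x(packet, x, t0) / packet(x, t0) for x in (0.2, 0.8)]
print(ratio_x, ratio_t, packet_ratios)
```

For the plane wave both ratios are constants, equal to $g_x$ and $-W$ respectively; for the Gaussian the ratio varies from point to point, so that no definite momentum can be ascribed to it.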


It is natural to associate this result with the fact that, in the general
case, the components $g_x$, $g_y$, $g_z$ of the momentum, as well as the energy
$W$, cannot be defined as certain numbers since they do not have
definite values, and to assume further that the operators $p_x$, $p_y$, $p_z$,
$-p_t$ by which they are replaced in the transition from classical to wave
mechanics must replace them in all wave-mechanical questions.

This principle is corroborated by the following considerations. 


(1) If the wave function $\psi$ can be approximated to by the expression
$e^{i2\pi S/h}$, where $S$ is the classical 'action', i.e. the momentum-potential
determined by the Hamilton-Jacobi equation, then we have

$$p_x\psi = \frac{h}{2\pi i}\frac{\partial}{\partial x}e^{i2\pi S/h} = e^{i2\pi S/h}\frac{\partial S}{\partial x} = g_x\psi,$$

etc., so that in this approximation the operators $p_x$, $p_y$, $p_z$ are actually
equivalent to the components of the momentum $g_x$, $g_y$, $g_z$. This result
still holds approximately if $\psi$ is represented in the form $Ae^{i2\pi S/h}$ where
$S$ is the classical momentum-potential, for the partial derivatives of the
amplitude $A$ with regard to $x$, $y$, $z$ (so far as the above approximation
can be applied) are very small compared with the partial derivatives
of $S/h$, i.e. the components of the wave number (the wave-length being
supposed to be very small).

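(2) If the function $\psi$ is 'quadratically integrable', i.e. if it can be
normalized in such a way that the integral $\int \psi\psi^*\,dV$ is equal to 1, then
the integrals

$$\int \psi^* p_x\psi\,dV, \quad \int \psi^* p_y\psi\,dV, \quad \int \psi^* p_z\psi\,dV$$

coincide with the average values of the components of the momentum
as defined by the integrals

$$m\int j_x\,dV, \quad m\int j_y\,dV, \quad m\int j_z\,dV,$$

where $\mathbf{j} = \psi\psi^*\mathbf{v}$ is the probability current density and $\mathbf{v}$ is the average
velocity introduced in the preceding chapter, §§ 2 and 3. We have in
fact, according to the definition of $j_x$,

$$m\int j_x\,dV = \frac{h}{4\pi i}\int\left(\psi^*\frac{\partial\psi}{\partial x} - \psi\frac{\partial\psi^*}{\partial x}\right)dV.$$

Now by partial integration we get

$$\int \psi\frac{\partial\psi^*}{\partial x}\,dV = \int \frac{\partial}{\partial x}(\psi\psi^*)\,dV - \int \psi^*\frac{\partial\psi}{\partial x}\,dV = -\int \psi^*\frac{\partial\psi}{\partial x}\,dV,$$

since in order that $\int \psi\psi^*\,dV$ should have a finite value the function $\psi\psi^*$
must vanish at infinity rapidly enough to make the integral

$$\int \frac{\partial}{\partial x}(\psi\psi^*)\,dV = \iint \big[\psi\psi^*\big]_{x=-\infty}^{x=+\infty}\,dy\,dz$$

vanish too. Therefore

$$m\int j_x\,dV = \frac{h}{2\pi i}\int \psi^*\frac{\partial\psi}{\partial x}\,dV = \int \psi^* p_x\psi\,dV.$$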
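The equality of the two expressions for the mean momentum can be checked on a simple one-dimensional example. The sketch below assumes units with $h/2\pi = 1$ and a Gaussian wave packet with wave number `K`; the packet, its width, and the integration limits are hypothetical choices, and the integrals are evaluated by the midpoint rule.

```python
import cmath
import math

HBAR = 1.0     # assumed units: h / (2*pi) = 1
MASS = 1.0     # assumed mass
K = 2.0        # assumed wave number of the packet
SIGMA = 1.0    # assumed width of the packet


def psi(x):
    """Normalized Gaussian packet exp(-x^2 / 4 sigma^2 + i K x)."""
    norm = (2 * math.pi * SIGMA ** 2) ** -0.25
    return norm * cmath.exp(-x * x / (4 * SIGMA ** 2) + 1j * K * x)


def dpsi(x, eps=1e-5):
    """d(psi)/dx by a central difference."""
    return (psi(x + eps) - psi(x - eps)) / (2 * eps)


def integrate(func, a=-12.0, b=12.0, n=4000):
    """Midpoint rule; the packet is negligible outside [a, b]."""
    step = (b - a) / n
    return sum(func(a + (i + 0.5) * step) for i in range(n)) * step


def j_x(x):
    """Probability current (hbar / 2 i m)(psi* psi' - psi psi*')."""
    d = dpsi(x)
    return (HBAR / (2j * MASS)) * (psi(x).conjugate() * d - psi(x) * d.conjugate())


norm = integrate(lambda x: abs(psi(x)) ** 2)
p_mean = integrate(lambda x: psi(x).conjugate() * (HBAR / 1j) * dpsi(x))
mj_mean = MASS * integrate(j_x)
print(norm, p_mean, mj_mean)
```

Both averages come out equal to $hK/2\pi$, the momentum associated with the carrier wave of the assumed packet.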

The preceding results can be extended to the more complicated
operators, by which different classical quantities represented as certain
functions of the coordinates and momenta $F(x, y, z; g_x, g_y, g_z)$ must be
replaced, when $g_x$, $g_y$, $g_z$ are replaced by the operators $p_x$, $p_y$, $p_z$. The
simplest example of such a complicated operator is the operator
$T = (p_x^2 + p_y^2 + p_z^2)/(2m)$ representing the kinetic energy. If the
function $\psi$ describes a motion with a given constant value of the total
energy, i.e. if it satisfies the Schrödinger equation $(T + U - W)\psi = 0$,
then we have $T\psi = (W - U)\psi$, where the 'operator' $(W - U)$ is a simple
factor. The preceding equation expresses the fact that the kinetic energy
(i.e. the magnitude of the classical velocity) is a definite function of the
coordinates. The sum of the operator $T$ and the potential energy $U$
represents the total energy of the particle and is usually called the
energy operator, or the Hamiltonian operator, or simply the
'Hamiltonian'. Denoting this operator by $H$, we can write the preceding
equation in the form $H\psi = W\psi$. It expresses the fact that the energy
of the particle in the motion described by the function $\psi$ has a definite
value, namely, $W$. The general equation referring to a non-conservative
motion can be written in the form

$$(H + p_t)\psi = 0. \tag{37}$$

It implies a certain relation between the two operators $H$ and $-p_t$,
both of which represent the energy $W$ (when it exists): the former in
a specific way, including the properties of the particle (mass) and the
character of the field of force in which it moves, and the latter in a
perfectly general way independent of these characteristics.

Independently of the form of the operator $F(x, y, z; p_x, p_y, p_z)$, it can
easily be shown that the result of applying it to the function $\psi$
expressed in the approximate form $e^{i2\pi S/h}$ (or $Ae^{i2\pi S/h}$) is equal
approximately to the product $F(x, y, z; g_x, g_y, g_z)\psi$. The same is true in the
more general case of an operator containing the time $t$ and the time
derivative operator $p_t$. We have namely

$$F(x, y, z, t; p_x, p_y, p_z, p_t)\psi = F(x, y, z, t; g_x, g_y, g_z, -W)\psi,$$

if the energy $W$ is defined as $-\partial S/\partial t$, in accordance with the
Hamilton-Jacobi equation which gives $-\partial S/\partial t = (\nabla S)^2/2m + U = T + U$. The
function $F\psi$ resulting from the application of the operator $F$ to the
exact wave function $\psi$ can be represented as the product of the latter
with a certain function $F_c$ of the coordinates alone (and eventually of
the time). The function $F_c = (F\psi)/\psi$ can be defined as the value of the
quantity represented by the operator $F$ at the corresponding point (and
instant of time). This is precisely the way in which we have defined
above the value of the kinetic energy in the case of a conservative
motion. If, in particular, the ratio $(F\psi)/\psi$ is equal to a constant $C$,
then the quantity represented by $F$ is said to be a constant of the motion,
its value $C$ being independent of the position of the particle (and of
the time). This case can be illustrated by applying the energy operator
$H$ to a function $\psi$ which describes a conservative motion, or by applying
any one of the operators $p_x$, $p_y$, $p_z$ to the function $\psi$ which describes
a uniform rectilinear motion.

If the ratio $F_c = (F\psi)/\psi$ is not equal to a constant, then we can
define the average or probable value of the quantity represented by
the operator $F$ by means of the formula

$$\bar F = \int F_c\,\psi\psi^*\,dV$$

or

$$\bar F = \int \psi^* F\psi\,dV, \tag{38}$$

with the condition that

$$\int \psi\psi^*\,dV = 1. \tag{38 a}$$

This definition of an average value is a generalization of that already
considered in the preceding chapter in connexion with quantities
depending on the coordinates alone (such as the potential energy). Its
physical significance has been tested above in the case of the
fundamental operators $p_x$, $p_y$, $p_z$.

As a further illustration of the operational representation of physical
quantities we shall consider the angular momentum of a particle, for
instance, the angular momentum of an electron moving about a fixed
nucleus (cf. Part I, § 14). In classical mechanics this quantity is defined
as a vector with the components

$$yg_z - zg_y, \quad zg_x - xg_z, \quad xg_y - yg_x.$$

We shall define it accordingly as a vector-operator $\mathbf{M}$ with the
components

$$M_x = yp_z - zp_y, \quad M_y = zp_x - xp_z, \quad M_z = xp_y - yp_x. \tag{39}$$

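Transforming from rectangular coordinates to spherical coordinates by
means of the formulae

$$x = r\sin\theta\cos\phi, \quad y = r\sin\theta\sin\phi, \quad z = r\cos\theta,$$

we get

$$\frac{\partial\psi}{\partial r} = \frac{\partial\psi}{\partial x}\frac{\partial x}{\partial r} + \frac{\partial\psi}{\partial y}\frac{\partial y}{\partial r} + \frac{\partial\psi}{\partial z}\frac{\partial z}{\partial r},$$

i.e.

$$r\frac{\partial}{\partial r} = r\sin\theta\cos\phi\,\frac{\partial}{\partial x} + r\sin\theta\sin\phi\,\frac{\partial}{\partial y} + r\cos\theta\,\frac{\partial}{\partial z} = x\frac{\partial}{\partial x} + y\frac{\partial}{\partial y} + z\frac{\partial}{\partial z},$$

and likewise

$$\frac{\partial}{\partial\phi} = -r\sin\theta\sin\phi\,\frac{\partial}{\partial x} + r\sin\theta\cos\phi\,\frac{\partial}{\partial y} = x\frac{\partial}{\partial y} - y\frac{\partial}{\partial x}.$$

We have therefore

$$M_z = \frac{h}{2\pi i}\frac{\partial}{\partial\phi}. \tag{39 a}$$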





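Further, from (39) we get

$$M^2 = M_x^2 + M_y^2 + M_z^2 = -\frac{h^2}{4\pi^2}\left[(y^2 + z^2)\frac{\partial^2}{\partial x^2} - 2yz\frac{\partial^2}{\partial y\,\partial z} - 2x\frac{\partial}{\partial x} + \dots\right]$$

$$= -\frac{h^2}{4\pi^2}\left[(r^2 - x^2)\frac{\partial^2}{\partial x^2} - 2yz\frac{\partial^2}{\partial y\,\partial z} - 2x\frac{\partial}{\partial x} + \dots\right],$$

where the terms denoted by ... are obtained from the given terms by
cyclic permutation of the coordinates $x$, $y$, $z$. Because of the identity

$$\left(x\frac{\partial}{\partial x} + y\frac{\partial}{\partial y} + z\frac{\partial}{\partial z}\right)^2 = x^2\frac{\partial^2}{\partial x^2} + \dots + 2yz\frac{\partial^2}{\partial y\,\partial z} + \dots + x\frac{\partial}{\partial x} + \dots,$$

or

$$x^2\frac{\partial^2}{\partial x^2} + \dots + 2yz\frac{\partial^2}{\partial y\,\partial z} + \dots = \left(r\frac{\partial}{\partial r}\right)^2 - r\frac{\partial}{\partial r},$$

we can write the previous expression in the form

$$M^2 = -\frac{h^2}{4\pi^2}\left[r^2\left(\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}\right) - \left(r\frac{\partial}{\partial r}\right)^2 - r\frac{\partial}{\partial r}\right] = -\frac{h^2}{4\pi^2}\left[r^2\nabla^2 - r^2\frac{\partial^2}{\partial r^2} - 2r\frac{\partial}{\partial r}\right].$$

Hence

$$\nabla^2 = \frac{\partial^2}{\partial r^2} + \frac{2}{r}\frac{\partial}{\partial r} - \frac{4\pi^2}{h^2}\frac{M^2}{r^2},$$

or putting

$$\nabla^2 = \frac{\partial^2}{\partial r^2} + \frac{2}{r}\frac{\partial}{\partial r} + \frac{1}{r^2}\Omega^2,$$

where

$$\Omega^2 = \frac{1}{\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\,\frac{\partial}{\partial\theta}\right) + \frac{1}{\sin^2\theta}\frac{\partial^2}{\partial\phi^2}$$

denotes the angular part of $\nabla^2$, we get

$$M^2 = -\frac{h^2}{4\pi^2}\Omega^2. \tag{39 b}$$

By applying this operator and the operator (39 a) to the functions
$\psi_{nlm} = F_{nl}(r)Y_{lm}(\theta, \phi)$, which specify the stationary states of a
hydrogen-like atom, we get

$$M^2\psi_{nlm} = F_{nl}(r)\,M^2 Y_{lm} = -\frac{h^2}{4\pi^2}F_{nl}\,\Omega^2 Y_{lm},$$

and by the equation $\Omega^2 Y_{lm} + l(l+1)Y_{lm} = 0$ we get

$$M^2\psi_{nlm} = \frac{h^2}{4\pi^2}\,l(l+1)\,\psi_{nlm}. \tag{40}$$

Since, further, the dependence of $Y_{lm}(\theta, \phi)$ upon $\phi$ is expressed by the
factor $e^{im\phi}$,

$$M_z\psi_{nlm} = \frac{mh}{2\pi}\,\psi_{nlm}. \tag{40 a}$$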

These relations show that the magnitude of the angular momentum as
well as its direction are constants of the motion, just as in the classical
theory of a particle moving in a central field of force. It should be
mentioned that the character of the central field affects only the radial
factor $F_{nl}(r)$ in the wave function $\psi_{nlm}$, the angular factor $Y_{lm}(\theta, \phi)$
being in all cases a spherical harmonic function. Therefore the above
relations hold for the motion of a particle not only in a Coulomb field
but in any central field of force. They show further that the quantum
numbers $l$ and $m$ which have been introduced in Part I, § 14, as
nodal numbers, characterizing the wave function $\psi_{nlm}$ from a purely
geometrical point of view, have also a dynamical meaning, one of them
($l$) determining the total magnitude of the angular momentum according
to the relation $M^2 = l(l+1)h^2/4\pi^2$, and the other ($m$) determining the
projection of the angular momentum upon the $z$-axis according to
$M_z = mh/2\pi$. For this reason the numbers $l$ and $m$ will be called
respectively the angular and the axial quantum numbers.† The constancy
of the direction of the angular momentum is only proved indirectly by
the relation (40 a) because the direction of the $z$-axis can be chosen
arbitrarily, the functions $\psi_{nlm}$ being so defined that the $z$-axis is the
axis of the spherical harmonic functions $Y_{lm}(\theta, \phi) = P_{lm}(\theta)e^{im\phi}$. If we
apply the operators $M_x$ and $M_y$ to these functions the result will not
be similar to that obtained by applying the operator $M_z$ because the
functions $M_x\psi_{nlm}$ and $M_y\psi_{nlm}$ are not equal to multiples of $\psi_{nlm}$. Since
we know that $M_x$ and $M_y$ also represent constants of the motion, we
see that the condition $F\psi = \text{const.}\,\psi$ cannot be regarded as the general
criterion for the constancy of the quantity represented by the operator
$F$. It can easily be shown that the above failure of this equation to
express the general condition of dynamical constancy is connected with
degeneracy, i.e. with the fact that the functions $\psi_{nlm}$ are not determined
by the value of the energy $W_n$ which, in fact, depends only on the
'principal' quantum number ($n$). Any linear combination of the $n^2$
functions $\psi_{nlm}$, which differ from one another by the values assigned
to the numbers $l$ and $m$, will also represent a stationary state belonging
to the same value of the energy. This linear combination, i.e. the

† This seems preferable to the traditional denomination where $l$ is referred to as the
'azimuthal' quantum number and $m$ as the 'magnetic' quantum number.

coefficients $C_{lm}$ in the sum $\sum_l \sum_m C_{lm}\psi_{nlm}$, can be so chosen that the
resulting function $\psi'_n$ will represent the same thing with respect to the
$x$-axis as $\psi_{nl'm'}$ with respect to the $z$-axis. Applied to this function the
operator $M_x$ would be equivalent to multiplication by $m'h/2\pi$
according to the equation $M_x\psi'_n = (hm'/2\pi)\psi'_n$ which could be considered
as a direct expression of the constancy of $M_x$. The function obtained
by applying $M_x$ to $\psi_{nlm}$ can easily be shown to reduce to a linear
combination $\sum_{m'=-l}^{+l} C_{m'}\psi_{nlm'}$ of the $2l+1$ functions $\psi_{nlm'}$ associated with the
$z$-axis.
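These statements can be verified exactly for $l = 1$ by letting the operators act on polynomials. The sketch below is an illustration under assumptions not made in the text: units with $h/2\pi = 1$, and the polynomials $x + iy$ and $z$ used in place of $rY_{11}$ and $rY_{10}$ (up to constant factors). Since the operators $\mathbf{M}$ do not affect functions of $r$ alone, the radial factor plays no part.

```python
# Polynomials in x, y, z are stored as {(a, b, c): coefficient},
# meaning coefficient * x**a * y**b * z**c.

def add(p, q, factor=1):
    """Return p + factor*q, dropping zero terms."""
    out = dict(p)
    for key, value in q.items():
        out[key] = out.get(key, 0) + factor * value
    return {k: v for k, v in out.items() if v != 0}


def scale(p, c):
    """Multiply a polynomial by the number c."""
    return {k: c * v for k, v in p.items() if c * v != 0}


def times_var(p, axis):
    """Multiply by x (axis 0), y (axis 1) or z (axis 2)."""
    out = {}
    for key, value in p.items():
        new_key = list(key)
        new_key[axis] += 1
        out[tuple(new_key)] = value
    return out


def d_var(p, axis):
    """Partial derivative with respect to x, y or z."""
    out = {}
    for key, value in p.items():
        if key[axis] > 0:
            new_key = list(key)
            new_key[axis] -= 1
            out[tuple(new_key)] = value * key[axis]
    return out


H_OVER_2PI_I = -1j   # h/(2*pi*i) in the assumed units h/(2*pi) = 1


def M(axis, p):
    """M_x = y p_z - z p_y and its cyclic permutations, as in (39)."""
    a, b = (axis + 1) % 3, (axis + 2) % 3
    rotated = add(times_var(d_var(p, b), a), times_var(d_var(p, a), b), factor=-1)
    return scale(rotated, H_OVER_2PI_I)


def M_squared(p):
    total = {}
    for axis in range(3):
        total = add(total, M(axis, M(axis, p)))
    return total


Y11 = {(1, 0, 0): 1, (0, 1, 0): 1j}   # x + i*y, proportional to r*Y_11
Y10 = {(0, 0, 1): 1}                  # z, proportional to r*Y_10

mz_y11 = M(2, Y11)        # equals Y11: M_z has the value m*h/2pi with m = 1
mx_y10 = M(0, Y10)        # equals -i*y: not a multiple of z
m2_y10 = M_squared(Y10)   # equals 2*z: l(l+1) with l = 1
print(mz_y11, mx_y10, m2_y10)
```

The function $z$ is reproduced by $M_z$ (with the factor 0) and by $M^2$ (with the factor 2), whereas $M_x$ turns it into $-iy$, which is a linear combination of $x + iy$ and $x - iy$, i.e. of the functions with $m = \pm 1$.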


7. Characteristic Functions and Values of Operators; Operational
Equations; Constants of the Motion

In general the equation $F\psi = \text{const.}\,\psi$ can only be satisfied by functions
$\psi$ of a special type which depend upon the nature of the operator $F$
and are therefore called the characteristic functions of this operator
('Eigenfunktionen' of the German authors, often translated into
English as 'proper functions'). The corresponding values of the constant
factor are called the characteristic values of $F$. As an example we may
take Schrödinger's equation $H\psi = W\psi$. In this equation the wave
functions describing the stationary states of motion are the
characteristic functions of the energy operator $H$, and the energy-levels
$W$ are its characteristic values. In the case of $H$, as well as in the
case of any other operator, these values and the functions associated
with them can form both a discrete and a continuous set. The
characteristic functions are fully determined by an operator $F$ for a
one-dimensional problem, involving one coordinate only. In
three-dimensional problems there remains in general a certain ambiguity
in the choice of the functions $\psi$, as determined by a single
equation of the type $F\psi = \text{const.}\,\psi$, an ambiguity which is known as
'degeneracy' if $F$ is the energy operator $H$. Thus, for example, the operator
$M_z = \dfrac{h}{2\pi i}\dfrac{\partial}{\partial\phi}$ specifies the corresponding characteristic functions only
with regard to their dependence upon $\phi$, defining them as $\psi = f(r, \theta)e^{im\phi}$
where $f(r, \theta)$ is an arbitrary function of $r$ and $\theta$. The operator $M^2$
likewise determines the dependence of the characteristic functions on the
angles $\theta$, $\phi$ only, the equation $M^2\psi = \text{const.}\,\psi$ being satisfied by
$\psi = f(r)Y_l(\theta, \phi)$ where $f(r)$ is an arbitrary function of $r$, and $Y_l(\theta, \phi)$ is an
arbitrary spherical harmonic of order $l$, which can be expressed as a sum
of $2l+1$ functions of the type $P_{lm}(\theta)e^{im\phi}$ with arbitrary coefficients.
Now we have also seen that Schrödinger's equation $H\psi = \text{const.}\,\psi$ in
the case of a hydrogen-like atom has for each characteristic value of
$H = W_n$ a solution of the form $\psi_n = f_n(r)Y(\theta, \phi)$, where $Y(\theta, \phi)$ is a sum
of $n^2$ spherical harmonic functions of the type $P_{lm}(\theta)e^{im\phi}$ with arbitrary
coefficients ($l = 0, 1, \dots, n-1$; $m = -l, \dots, +l$). We cannot therefore
completely specify the functions $\psi_{nlm}$ describing the stationary states
of a hydrogen atom by taking one of the three equations

$$H\psi = \text{const.}\,\psi, \quad M^2\psi = \text{const.}\,\psi, \quad M_z\psi = \text{const.}\,\psi, \tag{41}$$

but only by taking all three equations together. The functions $\psi_{nlm}$
then appear as the 'simultaneous characteristic functions' of the
operators $H$, $M^2$, and $M_z$, each of these functions belonging to a 'triplet'
of characteristic values $W_n$, $(M^2)_l = l(l+1)h^2/4\pi^2$, and $(M_z)_m = mh/2\pi$.

Another simple example of this relationship is provided by the
operators $p_x$, $p_y$, $p_z$. The characteristic functions of these operators are
obviously $f_1(y, z)e^{i2\pi g_x x/h}$, $f_2(z, x)e^{i2\pi g_y y/h}$, $f_3(x, y)e^{i2\pi g_z z/h}$, $f_1$, $f_2$, $f_3$ being
arbitrary functions of the corresponding arguments. Taken together
the three equations

$$p_x\psi = g_x\psi, \quad p_y\psi = g_y\psi, \quad p_z\psi = g_z\psi, \tag{41 a}$$

where $g_x$, $g_y$, $g_z$ are constants, specify unambiguously the function

$$\psi = \text{const.}\; e^{i2\pi(g_x x + g_y y + g_z z)/h}, \tag{41 b}$$

which describes the uniform rectilinear motion of a particle with the
momentum components $g_x$, $g_y$, $g_z$, and which is a particular solution of
Schrödinger's equation $H\psi = W\psi$ with $H = (p_x^2 + p_y^2 + p_z^2)/2m$, i.e. with
$U = 0$, corresponding to free motion.

It should be mentioned that the expression (41 b) for $\psi$ is still
incomplete (as well as the expression $\psi = f_n(r)Y_{lm}(\theta, \phi)$ for the hydrogen-like
atom functions) inasmuch as it does not contain the time. The latter
can be introduced by the additional relation

$$-p_t\psi = W\psi,$$

giving $\psi \propto e^{-i2\pi Wt/h}$. The constant $W$ is, however, not independent, but
is connected with $g_x$, $g_y$, $g_z$ by the relation $W = (g_x^2 + g_y^2 + g_z^2)/2m$.

If $F$ is an ordinary function of the coordinates (or of the time too)
which does not contain the elementary differential operators $p_x$, $p_y$, $p_z$,
then the equation $F\psi = \text{const.}\,\psi$ has no solutions of the ordinary
continuous type. The only possible solutions (except the trivial one $\psi = 0$)
are those for which the function $\psi$ is different from zero on the surface
$F = \text{const.}$ and vanishes outside this surface (which can be displaced
by varying arbitrarily the value of the constant).
Another interesting case is provided by operators which satisfy
the equation $F\psi = C\psi$ identically, i.e. irrespective of the choice of
the function $\psi$, and therefore do not determine this function at all.
$F = p_x x - xp_x$ is the simplest example of such an operator. Applying
it to some function $\psi$, we get

$$F\psi = \frac{h}{2\pi i}\left[\frac{\partial}{\partial x}(x\psi) - x\frac{\partial\psi}{\partial x}\right] = \frac{h}{2\pi i}\psi.$$

Thus we see that this operator has one single characteristic value
$C = h/2\pi i$ with which any function can be associated as a
'characteristic function'. The preceding equation can be written symbolically
in the form

$$p_x x - xp_x = \frac{h}{2\pi i}, \tag{42}$$

which is obtained by omitting the arbitrary function $\psi$ to which the
left- and right-hand sides of this equation must be applied. We have,
of course, similar equations for the two other coordinates and the
corresponding components of the momentum-operator: $p_y y - yp_y = h/2\pi i$
and $p_z z - zp_z = h/2\pi i$. In addition we have the 'operational' equations
$p_x y - yp_x = 0$ or $p_x y = yp_x$, etc., which express the fact that the order
in which the operators $p_x$ and $y$ are applied to any function $\psi(x, y, z)$
is immaterial (since $x$ and $y$ are independent variables). The equations
$p_x p_y - p_y p_x = 0$ are quite similar to the equations $xy - yx = 0$
expressing the commutative law of ordinary multiplication. Two operators
$F$ and $G$ which, when applied successively in the order $F$, $G$ to any
function $\psi$ give the same result as when applied in the opposite order
$G$, $F$, are said to be commutable. This property is expressed symbolically
by the operational equation

$$FG = GF, \tag{42 a}$$

which means that the ordinary equation

$$FG\psi = GF\psi$$

is satisfied identically, i.e. for any function $\psi$.

In general, the fact that the equation $A\psi = B\psi$ is satisfied identically
with respect to the function $\psi$, $A$ and $B$ being two outwardly different
operators, is expressed symbolically by the equation $A = B$. We shall
now give a few examples of such operational equations.

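Let us consider first of all the operator $F = p_x f - fp_x$ where $f(x, y, z)$
is an arbitrary (continuous) function of the coordinates. Applying it to
an arbitrary function $\psi$, we get

$$F\psi = \frac{h}{2\pi i}\left[\frac{\partial}{\partial x}(f\psi) - f\frac{\partial\psi}{\partial x}\right] = \frac{h}{2\pi i}\frac{\partial f}{\partial x}\psi,$$

so that

$$p_x f - fp_x = \frac{h}{2\pi i}\frac{\partial f}{\partial x}, \tag{43}$$

which means that the operator $p_x f - fp_x$ is equivalent to the multiplier
$\dfrac{h}{2\pi i}\dfrac{\partial f}{\partial x}$.

The preceding equation is often written in the form

$$\frac{\partial f}{\partial x} = [p_x, f], \tag{43 a}$$

where the bracket expression on the right side is defined by

$$[p_x, f] = \frac{2\pi i}{h}(p_x f - fp_x). \tag{43 b}$$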
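The operational equations (42) and (43) can be tested numerically by applying both sides to an arbitrary function. The sketch below works in one dimension with central differences; the units ($h/2\pi = 1$), the multiplier $f = \sin x$, the test function `psi`, and the point `x0` are assumptions chosen only for illustration.

```python
import cmath
import math

C = 1 / 1j   # h/(2*pi*i) in assumed units h/(2*pi) = 1


def deriv(func, x, eps=1e-5):
    """Central-difference approximation to d(func)/dx."""
    return (func(x + eps) - func(x - eps)) / (2 * eps)


def p_x(func):
    """Return the function p_x func = (h / 2 pi i) d(func)/dx."""
    return lambda x: C * deriv(func, x)


def psi(x):
    """An arbitrary (hypothetical) smooth test function."""
    return cmath.exp(-x * x + 0.5j * x)


def commutator_applied(f, x):
    """(p_x f - f p_x) psi evaluated at the point x."""
    f_psi = lambda u: f(u) * psi(u)
    return p_x(f_psi)(x) - f(x) * p_x(psi)(x)


x0 = 0.3
lhs_sin = commutator_applied(math.sin, x0)
rhs_sin = C * math.cos(x0) * psi(x0)      # (h / 2 pi i)(df/dx) psi, as in (43)
lhs_x = commutator_applied(lambda u: u, x0)
rhs_x = C * psi(x0)                       # (h / 2 pi i) psi, as in (42)
print(abs(lhs_sin - rhs_sin), abs(lhs_x - rhs_x))
```

Both differences are of the order of the discretization error, whatever smooth test function is chosen; the case $f = x$ reproduces equation (42).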

If, in the above definition of $F$, we replace $f$ by $x$ and $p_x$ by $p_x^n$ [which
means differentiation of the $n$th order with regard to $x$, combined with
a multiplication by $(h/2\pi i)^n$], we get

$$F\psi = p_x^n(x\psi) - xp_x^n\psi = n\,\frac{h}{2\pi i}\,p_x^{n-1}\psi,$$

so that

$$p_x^n x - xp_x^n = n\,\frac{h}{2\pi i}\,p_x^{n-1},$$

which can be rewritten symbolically in the form

$$xp_x^n - p_x^n x = -\frac{h}{2\pi i}\frac{\partial}{\partial p_x}p_x^n.$$
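Since $p_x^n$ is $(h/2\pi i)^n$ times the $n$th derivative, the relation above is equivalent to the identity $\frac{d^n}{dx^n}(x\psi) - x\frac{d^n\psi}{dx^n} = n\frac{d^{n-1}\psi}{dx^{n-1}}$. The following sketch checks this identity exactly on a polynomial with integer coefficients; the particular polynomial is a hypothetical test case.

```python
def trim(poly):
    """Drop trailing zero coefficients (keeping at least one entry)."""
    poly = list(poly)
    while len(poly) > 1 and poly[-1] == 0:
        poly.pop()
    return poly


def deriv(poly):
    """Derivative of a polynomial given as coefficients [a0, a1, a2, ...]."""
    return trim([k * a for k, a in enumerate(poly)][1:] or [0])


def nth_deriv(poly, n):
    for _ in range(n):
        poly = deriv(poly)
    return trim(poly)


def times_x(poly):
    """Multiply a polynomial by x."""
    return trim([0] + list(poly))


def sub(p, q):
    """Return p - q."""
    size = max(len(p), len(q))
    p = list(p) + [0] * (size - len(p))
    q = list(q) + [0] * (size - len(q))
    return trim([a - b for a, b in zip(p, q)])


# Hypothetical test function: 1 - 2x + 4x^3 + 3x^4.
P = [1, -2, 0, 4, 3]

results = []
for n in range(1, 6):
    lhs = sub(nth_deriv(times_x(P), n), times_x(nth_deriv(P, n)))
    rhs = trim([n * a for a in nth_deriv(P, n - 1)])
    results.append(lhs == rhs)
print(results)
```

Multiplying the identity by $(h/2\pi i)^n$ gives back the operational equation for $p_x^n x - xp_x^n$.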

This formula can easily be generalized for any operator expressible as
the sum of terms $a_n p_x^n$ with coefficients $a_n$ which do not depend upon
the coordinate $x$. Denoting this operator by $f(p_x, p_y, p_z; y, z)$, we get

$$xf - fx = -\frac{h}{2\pi i}\frac{\partial f}{\partial p_x}, \tag{44 a}$$

an equation very similar to (43) with $x$ playing the role of $-p_x$ and
$p_x$ the role of $x$. Putting

$$[x, f] = \frac{2\pi i}{h}(xf - fx) \tag{44 b}$$

we can consider the equation

$$\frac{\partial f}{\partial p_x} = -[x, f] \tag{44 c}$$

as the general definition of the operator $\partial/\partial p_x$. We shall write in general

$$[F, G] = \frac{2\pi i}{h}(FG - GF), \tag{45}$$

this 'bracket expression', introduced by Dirac as the quantum analogue
of the Poisson brackets, vanishing if the operators $F$ and $G$ commute
with one another.

It should be noticed that an operational equation $A = B$ expresses
the identity of the physical quantities represented by the operators
$A$ and $B$; the existence of such equations indicates that the same
physical quantity can be represented in wave mechanics in a number
of apparently different ways.

Another interesting and important illustration of operational
equations is provided by the representation of the angular momentum of
a particle.

From the definition (39) it follows that

$$M_x^2 = (yp_z - zp_y)^2 = (yp_z)^2 - (yp_z)(zp_y) - (zp_y)(yp_z) + (zp_y)^2$$

$$= y^2p_z^2 + z^2p_y^2 - yp_yp_zz - zp_zp_yy,$$

since $p_y$ commutes with $z$ and $p_z$, and $p_z$ commutes with $y$ and $p_y$. Taking
into account the relations $p_zz = zp_z + h/2\pi i$ and $p_yy = yp_y + h/2\pi i$,
we get

$$M_x^2 = y^2p_z^2 + z^2p_y^2 - 2yzp_yp_z - \frac{h}{2\pi i}(yp_y + zp_z),$$

whence the formula (39 b) can easily be obtained. We have in addition

$$M_xM_y = (yp_z - zp_y)(zp_x - xp_z) = yp_zzp_x - zp_yzp_x - yp_zxp_z + zp_yxp_z$$

$$= yp_xp_zz - z^2p_yp_x - yxp_z^2 + zxp_yp_z,$$

whence

$$M_xM_y - M_yM_x = yp_xp_zz + zxp_yp_z - xp_yp_zz - zyp_xp_z$$

$$= (yp_x - xp_y)(p_zz - zp_z) = \frac{h}{2\pi i}(yp_x - xp_y) = -\frac{h}{2\pi i}M_z.$$

Thus, according to (45),

$$[M_x, M_y] = -M_z. \tag{45 a}$$

In a similar way we can derive the relations $[M_y, M_z] = -M_x$ and
$[M_z, M_x] = -M_y$, which can also be obtained from (45 a) by a cyclic
permutation of the indices $x$, $y$, $z$. These three relations can be replaced
by the symbolic vector equation

$$\mathbf{M} \times \mathbf{M} = -\frac{h}{2\pi i}\mathbf{M}, \tag{45 b}$$

where $\mathbf{A} \times \mathbf{B}$ is defined in the usual way as the vector product of $\mathbf{A}$
and $\mathbf{B}$.
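The relation (45 a) and its cyclic permutations can be confirmed by letting the operators act on an arbitrary polynomial, for which differentiation is exact. As before this is only a sketch: the units $h/2\pi = 1$ and the test polynomial are assumptions made for the illustration.

```python
# Polynomials in x, y, z are stored as {(a, b, c): coefficient}.

def add(p, q, factor=1):
    """Return p + factor*q, dropping zero terms."""
    out = dict(p)
    for key, value in q.items():
        out[key] = out.get(key, 0) + factor * value
    return {k: v for k, v in out.items() if v != 0}


def scale(p, c):
    return {k: c * v for k, v in p.items() if c * v != 0}


def times_var(p, axis):
    """Multiply by x (axis 0), y (axis 1) or z (axis 2)."""
    out = {}
    for key, value in p.items():
        new_key = list(key)
        new_key[axis] += 1
        out[tuple(new_key)] = value
    return out


def d_var(p, axis):
    """Partial derivative with respect to x, y or z."""
    out = {}
    for key, value in p.items():
        if key[axis] > 0:
            new_key = list(key)
            new_key[axis] -= 1
            out[tuple(new_key)] = value * key[axis]
    return out


H_OVER_2PI_I = -1j   # h/(2*pi*i) in the assumed units h/(2*pi) = 1
TWO_PI_I_OVER_H = 1j  # 2*pi*i/h in the same units


def M(axis, p):
    """M_x = y p_z - z p_y and its cyclic permutations, as in (39)."""
    a, b = (axis + 1) % 3, (axis + 2) % 3
    rotated = add(times_var(d_var(p, b), a), times_var(d_var(p, a), b), factor=-1)
    return scale(rotated, H_OVER_2PI_I)


def bracket(a, b, p):
    """[M_a, M_b] p with the bracket defined as in (45)."""
    difference = add(M(a, M(b, p)), M(b, M(a, p)), factor=-1)
    return scale(difference, TWO_PI_I_OVER_H)


# Hypothetical test polynomial: x^2 y + 2 z^3 + x y z - 3 y.
TEST = {(2, 1, 0): 1, (0, 0, 3): 2, (1, 1, 1): 1, (0, 1, 0): -3}

checks = []
for a, b, c in ((0, 1, 2), (1, 2, 0), (2, 0, 1)):
    checks.append(bracket(a, b, TEST) == scale(M(c, TEST), -1))
print(checks)
```

All three comparisons hold exactly, since the coefficients remain Gaussian integers throughout; any other polynomial could be substituted for the test function.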

Interesting results are obtained by calculating the bracket expressions
for the components of the vector $\mathbf{M}$ on the one hand, and the
components of the vector $\mathbf{r}(x, y, z)$ or $\mathbf{p}(p_x, p_y, p_z)$ on the other. We
shall not go into these calculations (which can easily be carried out by
the reader) but shall merely notice the following results:

$$[p^2, \mathbf{M}] = 0, \qquad [p^2, M^2] = 0, \tag{46}$$

where $p^2 = p_x^2 + p_y^2 + p_z^2$, the first of these equations being equivalent
to the three equations $[p^2, M_x] = 0$, $[p^2, M_y] = 0$, $[p^2, M_z] = 0$. These
equations express the fact that the angular momentum of a particle
commutes with its kinetic energy $T = p^2/2m$ (more exactly we should
speak of the operators representing the angular momentum and the
kinetic energy). If the potential energy $U$ is a function of the distance
$r = \sqrt{x^2 + y^2 + z^2}$ alone (which corresponds to a central field of force),
then we also have

$$[U, \mathbf{M}] = 0, \qquad [U, M^2] = 0, \tag{46a}$$

and consequently

$$[H, \mathbf{M}] = 0, \qquad [H, M^2] = 0, \tag{46b}$$

where $H = p^2/2m + U$ is the Hamiltonian operator representing the
total energy of the particle.
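As an illustration of how such calculations run, the first of the relations (46) can be checked for the component $M_x$. The steps below are a sketch that uses only $p_yy - yp_y = p_zz - zp_z = h/2\pi i$ and the definition (45); the arrangement of the steps is ours, not the author's.

```latex
% Check of [p^2, M_x] = 0 with M_x = y p_z - z p_y.
% Only p_y^2 fails to commute with y, and only p_z^2 with z.
\begin{align*}
p_y^2\,y - y\,p_y^2 &= p_y(p_y y - y p_y) + (p_y y - y p_y)\,p_y = 2\,\frac{h}{2\pi i}\,p_y, \\
p_z^2\,z - z\,p_z^2 &= 2\,\frac{h}{2\pi i}\,p_z, \\
p^2 M_x - M_x p^2 &= (p_y^2 y - y p_y^2)\,p_z - (p_z^2 z - z p_z^2)\,p_y
  = 2\,\frac{h}{2\pi i}\,(p_y p_z - p_z p_y) = 0, \\
[p^2, M_x] &= \frac{2\pi i}{h}\,(p^2 M_x - M_x p^2) = 0 .
\end{align*}
```

The components $M_y$ and $M_z$ follow by cyclic permutation of $x$, $y$, $z$.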

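The relations (46 b) can be obtained very simply by using polar
coordinates $r$, $\theta$, $\phi$ to represent $H$ and $\mathbf{M}$. Then

$$H = \frac{1}{2m}\left(\frac{h}{2\pi i}\right)^2\left[\frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial}{\partial r}\right) + \frac{1}{r^2}\Omega^2\right] + U(r), \qquad M_z = \frac{h}{2\pi i}\frac{\partial}{\partial\phi}, \qquad M^2 = \left(\frac{h}{2\pi i}\right)^2\Omega^2,$$

where $\Omega^2 = \frac{1}{\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial}{\partial\theta}\right) + \frac{1}{\sin^2\theta}\frac{\partial^2}{\partial\phi^2}$ involves the angles only, and so

$$[H, M_z] = \frac{1}{2m}\left(\frac{h}{2\pi i}\right)^3\frac{1}{r^2}\left[\Omega^2, \frac{\partial}{\partial\phi}\right] = 0, \qquad [H, M^2] = \frac{1}{2m}\left(\frac{h}{2\pi i}\right)^4\frac{1}{r^2}[\Omega^2, \Omega^2] = 0,$$

both bracket expressions $[\Omega^2, \partial/\partial\phi]$ and $[\Omega^2, \Omega^2]$ obviously vanishing.†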

The equations (46 b) must be naturally related to the fact that $\mathbf{M}$
and $M^2$ represent quantities which are constants of the motion (in the
case of a radially symmetrical field of force). An equation of the type

$$[H, F] = 0, \tag{47}$$

i.e. the commutability of an operator $F$ with the energy operator $H$,
can actually be considered as the most general expression of the fact that
$F$ represents a constant of the motion determined by the operator $H$,
i.e. by Schrödinger’s equation $H\psi = W\psi$.

In fact, applying the operator $F$ to both sides of this equation, we
have $FH\psi = WF\psi$ or, if $HF = FH$, we obtain $H(F\psi) = W(F\psi)$. This
shows that the function $F\psi$ satisfies the same equation as the function
$\psi$ with the same characteristic value of the energy operator $H$. If there
is no degeneracy, i.e. if there is but one function $\psi$ associated with the
characteristic value $W$, then $F\psi$ can differ from $\psi$ by a constant factor
only (which is immaterial so far as the equation $H\psi = W\psi$ is concerned).
Thus in this case we get $F\psi = \text{const.}\,\psi$, which is the original
condition for the constancy of the quantity represented by $F$ in the
motion described by $\psi$. In the general case, i.e. when there is degeneracy,
the function $F\psi$ must obviously be equal to a linear combination
of all the functions $\psi_1, \psi_2, \ldots, \psi_r$ associated with the same
characteristic value of $H$, i.e. satisfying the equation $H\psi_k = W\psi_k$
($k = 1, 2, \ldots, r$), with the same value of the energy. Applying $F$ to one
of these functions we thus get, if $FH = HF$,

$$F\psi_k = \sum_{l=1}^{r} c_{kl}\psi_l, \tag{47a}$$

where $c_{kl}$ are constant numbers, the matrix

$$\begin{pmatrix} c_{11} & c_{12} & \cdots & c_{1r} \\ c_{21} & c_{22} & \cdots & c_{2r} \\ \vdots & \vdots & & \vdots \\ c_{r1} & c_{r2} & \cdots & c_{rr} \end{pmatrix}$$

replacing the single constant $C$ of the non-degenerate case.

† In order to obtain (46 a) without the use of polar coordinates we need only notice
that $[U, M_x] = [U, yp_z - zp_y] = y[U, p_z] - z[U, p_y] = z\,\partial U/\partial y - y\,\partial U/\partial z$ according to (43 a).

The fact that the equations (47 a) actually express the constancy of
$F$ can be proved by reducing them to a system of the standard form

$$F\psi'_n = c'_n\psi'_n, \tag{47b}$$

where $\psi'_n$ ($n = 1, 2, \ldots, r$) are a set of $r$ new characteristic functions of
$H$ belonging to the same energy-level $W$ as the original functions
$\psi_1, \ldots, \psi_r$ and therefore equal to certain linear combinations of the latter.
In order to determine them, we shall first consider the inverse transformation,
i.e. we shall express the original functions as linear combinations
of the new ones by means of the formulae

$$\psi_k = \sum_{n=1}^{r} a_{kn}\psi'_n. \tag{48}$$

If these expressions are substituted in equations (47 a), then, in conjunction
with (47 b), we get

$$\sum_{n} a_{kn}c'_n\psi'_n = \sum_{l}\sum_{n} c_{kl}a_{ln}\psi'_n.$$

Equating the coefficients of the same $\psi'_n$ and dropping the index $n$,
we get

$$\sum_{l=1}^{r} c_{kl}a_l = c'a_k \qquad (k = 1, 2, \ldots, r). \tag{48a}$$

This is a system of $r$ linear homogeneous equations for the determination
both of the transformation coefficients $a$ and of the characteristic
values $c'$. The compatibility condition for equations (48 a)

$$\begin{vmatrix} c_{11} - c' & c_{12} & \cdots & c_{1r} \\ c_{21} & c_{22} - c' & \cdots & c_{2r} \\ \vdots & \vdots & & \vdots \\ c_{r1} & c_{r2} & \cdots & c_{rr} - c' \end{vmatrix} = 0 \tag{48b}$$

gives $r$ (in general different) values for the unknown $c'$, and to each
of these values $c'_n$ there belongs a definite set of coefficients $a_k$, namely,
$a_{1n}, a_{2n}, \ldots, a_{rn}$. By solving equations (48) with respect to the $\psi'_n$ we
can obtain the explicit expressions for the new functions in terms of
the original ones.
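For $r = 2$ the procedure can be carried through by hand, and the sketch below does so numerically. It is only an illustration: the matrix `c` is a made-up example, the function names are ours, and real roots and $c_{12} \neq 0$ are assumed.

```python
import math


def secular_roots_2x2(c):
    """Solve det(c - c' * 1) = 0, i.e. (48b) with r = 2, assuming real roots."""
    (c11, c12), (c21, c22) = c
    trace = c11 + c22
    det = c11 * c22 - c12 * c21
    disc = trace * trace - 4.0 * det
    if disc < 0:
        raise ValueError("complex roots: not handled in this sketch")
    root = math.sqrt(disc)
    return (trace - root) / 2.0, (trace + root) / 2.0


def coefficients(c, c_prime):
    """Return normalized (a_1, a_2) solving (48a) for one root, assuming c12 != 0."""
    (c11, c12), _ = c
    # First equation of (48a): (c11 - c') a_1 + c12 a_2 = 0.
    a1, a2 = c12, c_prime - c11
    norm = math.hypot(a1, a2)
    return a1 / norm, a2 / norm


# Hypothetical matrix c_kl of a constant of the motion in a doubly degenerate level.
c = [[2.0, 1.0], [1.0, 2.0]]
for c_prime in secular_roots_2x2(c):
    print(c_prime, coefficients(c, c_prime))
```

Each root $c'$ comes with its own pair $(a_1, a_2)$, which is the column $a_{1n}, a_{2n}$ of the transformation (48) belonging to that root.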

Summing up the preceding results, we can say that the condition
$[H, F] = 0$ expresses the constancy of $F$ with respect to all such types
of motion as are described by functions $\psi$ satisfying simultaneously
the equations $H\psi = \text{const.}\,\psi$ and $F\psi = \text{const.}\,\psi$. The functions $\psi$ are
thus simultaneously the characteristic functions of both $H$ and $F$.

So far we have regarded the energy as the queen of all the operators,
but the above considerations seem to banish the energy from this
supreme position and to reduce the Schrödinger equation $H\psi = \text{const.}\,\psi$
to the same humble role as that of any other equation $F\psi = \text{const.}\,\psi$
for the characteristic functions and values of any other operator $F$.
Provided the operator $F$ has a dynamical meaning, its characteristic
functions will describe the motion just as well as the Schrödinger wave
functions although perhaps less completely and from a different point
of view. The product $\psi\psi^*$ will represent the probability of finding the
particle in the volume-element $dV$ even if $\psi$ is a characteristic function
of some operator $F$ different from the energy without being simultaneously
a characteristic function of the latter. The above-mentioned
difference in the point of view is obviously as follows: if $\psi$ is the characteristic
function of Schrödinger’s wave equation, then $\psi\psi^*\,dV$ measures
the probability of finding the particle in the volume-element $dV$ with
a specified energy $W$ (the characteristic value of $H$ associated with $\psi$);
if $\psi$ is the characteristic function of some other operator $F$, then $\psi\psi^*\,dV$
measures the probability of finding the particle in the volume-element
$dV$ with a specified value of the quantity represented by $F$.

The fact that the probability determined by some ‘wave function’
$\psi$ has a conditional character only, dependent upon the assumption of
a certain specified value for the quantity or quantities by which (or
rather by whose operators) the function $\psi$ is characterized, is of fundamental
importance for a deeper understanding and further development
of wave-mechanical theory. We shall not stress this further here, but
shall limit ourselves to the following remarks.

(1) In the case of a one-dimensional motion the Schrödinger wave
functions are completely determined by one operator only, namely, the
energy operator $H$. This means that the energy is the only independent
constant of the motion, i.e. that any other operator $F$ commuting with
$H$ represents simply a function of $H$. A function of this kind can be
defined by the fact that its characteristic values are a definite function
of the characteristic values of $H$. If, for instance, $H\psi = W\psi$, then

$$H^2\psi = H(H\psi) = HW\psi = WH\psi = W^2\psi, \qquad H^n\psi = W^n\psi,$$

and in general

$$F(H)\psi = F(W)\psi, \tag{49}$$

a result which can be proved directly if $F$ is represented by a power
series in $H$ with constant coefficients and which can be used as a definition
of $F(H)$ in the general case. The wave functions describing the
motion of a particle in three dimensions are completely determined not
by the energy operator alone, but by three independent mutually commuting
operators which represent three constants of the motion (if one
of them is the energy, or if they indirectly involve the energy, all the
three commuting with the latter), such that their common characteristic
functions are at the same time solutions of the Schrödinger equation
$H\psi = W\psi$.

(2) If the function $\psi$ does not satisfy this equation, then it does not
describe the motion, and the operator or operators by which it is defined
(according to the equations $F\psi = \text{const.}\,\psi$) can be said to have specified
values, but not constant values, i.e. values which are not permanent in time.
Thus time appears as the correlate of energy, a fact which is obvious
in view of the possibility of representing the energy not only by the
Hamiltonian operator $H$, but also by the time derivative operator

$$-p_t = -\frac{h}{2\pi i}\frac{\partial}{\partial t},$$

the general form of the Schrödinger equation $(H + p_t)\psi = 0$
merely expressing the equivalence of the two representations with
respect to a certain set of functions.

8. Probable Values of Physical Quantities and their Change with the Time

In classical mechanics time enjoys a supreme role entirely different 
from all the other variables, being actually the only independent 
variable. The main problem of mechanics is to determine how all the
other variables, in particular the coordinates, change with the time.
In wave mechanics the time seems, at first sight, to be reduced to
a humbler role, since the spatial coordinates no longer depend on the
time but are treated (so far as the wave-mechanical ‘equation of
motion’ is concerned) as independent variables, that is, they appear
on the same footing as the time itself.

This equivalence between the spatial coordinates and the time is
restricted, however, as we know, to the wave equation $(H + p_t)\psi = 0$
and does not extend to the boundary conditions under which it has to
be solved nor to the interpretation of its solutions. Thus a function
$\psi(x, y, z, t)$ which satisfies the preceding equation is interpreted as the
measure of the probability of finding the particle under consideration
in a volume-element $dV = dx\,dy\,dz$ at a definite instant of time, the
probability in question being defined as equal or proportional to $\psi\psi^*\,dV$.
If time played the same role as the coordinates, we should not be able
to refer the probability to a definite instant of time but should instead
refer it to an interval of time $dt$, and define it as proportional to $\psi\psi^*\,dV\,dt$.
There is, however, actually no reason why we should not be able to
refer the probability of location to a given instant of time, for the
particle must be somewhere at any moment. The exceptional role of
the time becomes particularly clear if we restrict ourselves to solutions
of the Schrödinger equation which vanish at infinite distance (they
cannot vanish for $t = \pm\infty$ except in separate places!) in such a way
as to ensure the convergence of the integral $\int \psi\psi^*\,dV$ extended over
all space. Taking the time derivative of this integral and replacing
$\partial(\psi\psi^*)/\partial t$ by $-\operatorname{div}\mathbf{j}$, where

$$\mathbf{j} = \frac{h}{4\pi i m}(\psi^*\nabla\psi - \psi\nabla\psi^*)$$

is the probability current density, then, if the integration is first extended over a finite
volume limited by a closed surface, we get

$$\frac{d}{dt}\int \psi\psi^*\,dV = -\oint j_n\,dS, \tag{50}$$

where $j_n$ is the normal component of $\mathbf{j}$. When the surface $S$ is removed
to infinity the latter integral tends to zero (so long as $\psi$ is supposed to
be quadratically integrable), so that in the limit we get

$$\int_\infty \psi\psi^*\,dV = \text{const.},$$

which enables one to normalize $\psi$ to 1 by the condition

$$\int_\infty \psi\psi^*\,dV = 1. \tag{50a}$$

It should be remarked that this result holds for the motion of the
particle not only in a constant field of force (this case has been considered
in § 17, Part I), but also in a variable field of force.

Now if $\int \psi\psi^*\,dV$ is constant, it is futile to consider the integral
$\iint \psi\psi^*\,dV\,dt$ with a view to normalizing the function $\psi$ in such a way
that the time would appear on the same footing as the coordinates.
The Hamiltonian operator $H$, which, as we have seen, is intimately
connected with the time, must therefore play an exceptional role in
determining the permanence or non-permanence in time of different
quantities connected with the motion.

As has been shown before, this permanence is determined by the condition
$HF - FH = 0$, where $F$ is the operator representing the quantity
in question. We are now going to generalize this result for quantities
which are not constants of the motion, i.e. quantities for which the condition
$HF - FH = 0$ is not fulfilled.

In classical mechanics such quantities can be determined as functions
of the time. In wave mechanics such a determination is only possible
for their probable values, as defined by

$$\bar{F} = \int \psi^*F\psi\,dV,$$

under the condition (50 a) (which is fulfilled for a motion restricted to
a finite region or represented by a wave packet).

Differentiating $\bar{F}$ with regard to the time, and taking into account
the equations

$$\left(H + \frac{h}{2\pi i}\frac{\partial}{\partial t}\right)\psi = 0, \qquad \left(H - \frac{h}{2\pi i}\frac{\partial}{\partial t}\right)\psi^* = 0,$$

we get

$$\frac{d\bar{F}}{dt} = \frac{2\pi i}{h}\int [(H\psi^*)(F\psi) - \psi^*F(H\psi)]\,dV.$$

Now it can easily be proved that

$$\int (H\psi^*)(F\psi)\,dV = \int \psi^*H(F\psi)\,dV.$$

In fact, putting $F\psi = f_1$, $\psi^* = f_2$, and writing the operator $H$ in the
form

$$H = -\frac{h^2}{8\pi^2m}\nabla^2 + U,$$

we find

$$f_1Hf_2 - f_2Hf_1 = -\frac{h^2}{8\pi^2m}(f_1\nabla^2f_2 - f_2\nabla^2f_1) = -\frac{h^2}{8\pi^2m}\operatorname{div}\mathbf{f}_{12}, \qquad \mathbf{f}_{12} = f_1\nabla f_2 - f_2\nabla f_1.$$

If the integral $\int_\infty f_1f_2\,dV$ is convergent, then the integral
$\int \operatorname{div}\mathbf{f}_{12}\,dV = \oint f_{12n}\,dS$ must vanish
when the integration is extended over all space (the surface $S$ receding
to infinity), so that we get

$$\int f_1Hf_2\,dV = \int f_2Hf_1\,dV. \tag{51}$$

It should be mentioned that all operators having the property expressed
by this equation are called ‘self-adjoint’. Strictly speaking, the self-adjointness
of an operator $H$ is expressed by the fact that the
difference $f_1Hf_2 - f_2Hf_1$ is equal to the divergence of some vector; this
condition leads to (51) when combined with the condition

$$\int f_1f_2\,dV = \text{finite}. \tag{51a}$$

The latter condition is certainly fulfilled for $f_1 = F\psi$ and $f_2 = \psi^*$ so
long as (50 a) is fulfilled.

We thus can rewrite the above expression for $d\bar{F}/dt$ in the form

$$\frac{d\bar{F}}{dt} = \frac{2\pi i}{h}\int [\psi^*H(F\psi) - \psi^*F(H\psi)]\,dV,$$

or

$$\frac{d\bar{F}}{dt} = \frac{2\pi i}{h}\int \psi^*(HF - FH)\psi\,dV. \tag{52}$$

It follows from this formula that $d\bar{F}/dt = 0$, which means that $F$ is
a constant of the motion, if $HF = FH$. This agrees with the result
found before. According to the general definition of the probable value
of a quantity represented by some operator $F$, we can define the right-hand
side of (52) as the average value of the operator

$$\frac{2\pi i}{h}(HF - FH) = [H, F].$$

Therefore

$$\frac{d\bar{F}}{dt} = \overline{[H, F]},$$

or

$$\frac{dF}{dt} = [H, F], \tag{52a}$$

if $dF/dt$ is regarded as an operator defined by equation (52 a) and satisfying
the condition

$$\overline{\frac{dF}{dt}} = \frac{d\bar{F}}{dt}.$$

In the derivation of (52 a) we have tacitly assumed that $F$ did not
contain the time explicitly. If it does contain the time, then equation
(52 a) must be replaced by

$$\frac{dF}{dt} = \frac{\partial F}{\partial t} + [H, F]. \tag{52b}$$


For example, let us put $F = x$. The time derivative of $x$ as a quantity
is equal to zero, since $x$ is independent of $t$. Regarding $x$, or rather
$dx/dt$, as an operator, however, we have

$$\frac{dx}{dt} = [H, x] = -[x, H],$$

or according to (44 c)

$$\frac{dx}{dt} = \frac{\partial H}{\partial p_x}, \tag{53}$$

which, with

$$H = \frac{1}{2m}(p_x^2 + p_y^2 + p_z^2) + U(x, y, z),$$

gives

$$\frac{dx}{dt} = \frac{1}{m}p_x. \tag{53a}$$

This equation coincides superficially with the classical relation between
velocity and momentum, considered as definite quantities. In wave
mechanics, however, they are indefinite quantities represented by the
operators $d\mathbf{r}/dt$ and $\mathbf{p} = m\,d\mathbf{r}/dt$. Putting $F = p_x$, we have

$$\frac{dp_x}{dt} = [H, p_x] = [U, p_x] = -[p_x, U],$$

or, according to (43 a),

$$\frac{dp_x}{dt} = -\frac{\partial H}{\partial x} = -\frac{\partial U}{\partial x}. \tag{53b}$$

Equations (53) and (53 b), together with the corresponding equations
for the $y$ and $z$ components, are formally identical with the classical
equations of motion in the ‘canonical’ form (see preceding chapter, § 5).
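As a worked illustration of these operator equations, consider a one-dimensional harmonic potential. This potential is our assumption, chosen because the force is linear in $x$, so that averaging the equations term by term gives a closed equation for the probable value $\bar{x}$; the last line uses the relation between the average of $dF/dt$ and the time derivative of $\bar{F}$ stated above.

```latex
% Assumption: one-dimensional harmonic potential U = k x^2 / 2 (illustrative choice).
\begin{align*}
H &= \frac{p_x^2}{2m} + \tfrac{1}{2} k x^2, \\
\frac{dx}{dt} &= [H, x] = \frac{\partial H}{\partial p_x} = \frac{p_x}{m}, \\
\frac{dp_x}{dt} &= [H, p_x] = -\frac{\partial U}{\partial x} = -k x, \\
\frac{d^2 \bar{x}}{dt^2} &= \frac{1}{m}\frac{d\bar{p}_x}{dt} = -\frac{k}{m}\,\bar{x}.
\end{align*}
```

For a force that is not linear in $x$ the average of $\partial U/\partial x$ is in general not the same function of $\bar{x}$, and the last step would not close in this way.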
If the classical quantity represented by the operator $F$ is defined as a
function of the time and of the (classical) variables $x$, $p_x$; $y$, $p_y$; $z$, $p_z$,
we have

$$\frac{dF}{dt} = \frac{\partial F}{\partial t} + \sum\left(\frac{\partial F}{\partial x}\frac{dx}{dt} + \frac{\partial F}{\partial p_x}\frac{dp_x}{dt}\right) = \frac{\partial F}{\partial t} + \sum\left(\frac{\partial H}{\partial p_x}\frac{\partial F}{\partial x} - \frac{\partial H}{\partial x}\frac{\partial F}{\partial p_x}\right)$$

according to (53) and (53 b). Comparing this with (52 b) we see that
the classical analogue of the quantum bracket expression $[H, F]$ is the
sum

$$\sum\left(\frac{\partial H}{\partial p_x}\frac{\partial F}{\partial x} - \frac{\partial H}{\partial x}\frac{\partial F}{\partial p_x}\right),$$

which is the classical Poisson bracket expression.

Equation (52 a) looks very similar to equation (43) and the equations
corresponding to the other two coordinates, namely,

$$\frac{\partial f}{\partial x} = [p_x, f], \qquad \frac{\partial f}{\partial y} = [p_y, f], \qquad \frac{\partial f}{\partial z} = [p_z, f], \tag{54}$$

the time $t$ being related to the energy operator $H$ in the same way as the
coordinates $x$, $y$, $z$ are related to the operators $p_x$, $p_y$, $p_z$ representing
the components of momentum. This relationship seems very natural
from the point of view of the relativity theory and seems to indicate
that time and energy must be treated on the same footing as the spatial
coordinates and the components of the momentum. The similarity
between the relations $dF/dt = [H, F]$ and $\partial f/\partial x = [p_x, f]$ is, however,
only apparent, for in the latter case $f$ denotes a function or operator
depending explicitly upon $x$, and $\partial/\partial x$ denotes partial differentiation
with regard to $x$, while in the former case $F$ is a function or operator
which does not contain $t$ explicitly. The time equivalent of equations
(54) is easily seen to be

$$\frac{\partial F}{\partial t} = [p_t, F]. \tag{54a}$$

This equation follows immediately from the definition of the operator
$p_t = \dfrac{h}{2\pi i}\dfrac{\partial}{\partial t}$. Replacing $\partial F/\partial t$ in (52 b) by $[p_t, F]$ we get

$$\frac{dF}{dt} = [(H + p_t), F]. \tag{54b}$$

It should be noticed that the operator $H + p_t$ does not vanish identically,
as might appear from the equation $(H + p_t)\psi = 0$, but only with respect
to the functions defined by this equation and describing the general
type of motion determined by the Hamiltonian $H$. The fact that there
are actually two different operators $H$ and $-p_t$ representing the same
quantity, i.e. the energy, and equivalent to one another with respect
to the wave functions describing the motion of the particle, suggests
the possibility of restoring the symmetry between time and space which
is required by the relativity theory by introducing certain operators
$G_x$, $G_y$, $G_z$ which, though entirely different from $p_x$, $p_y$, $p_z$, would represent
the same thing as the latter, i.e. the components of the momentum.
The operators $G$ would have to be defined so as to be equivalent to
the corresponding $p$ with respect to the same wave functions as the
operators $H$ and $-p_t$. If this were possible, we could replace the time
in its exceptional role by any one of the three coordinates $x$, $y$, $z$,
e.g. we could define the wave functions by an equation of the type
$(G_x - p_x)\psi = 0$, and interpret $\psi\psi^*\,dy\,dz\,dt$ as the probability of finding
the particle in the region specified by $dy$, $dz$, and $dt$ for a definite value
of its $x$-coordinate. We could further define the average or probable
value of an operator by the formula $\bar{F} = \iiint \psi^*F\psi\,dy\,dz\,dt$ as a definite
function of $x$ and obtain for its derivative with respect to $x$ an expression
similar to (52) or (52 b), i.e.

$$\frac{d\bar{F}}{dx} = \overline{\frac{\partial F}{\partial x}} - \overline{[G_x, F]},$$

provided the operator $G_x$ were self-adjoint, in the same sense as $H$.

This relativistic symmetry between space and time, as expressed by
the equal eligibility of any one of the four quantities $x$, $y$, $z$, $t$, and the
associated quantities $G_x$, $G_y$, $G_z$, $H$ to the presidential role which has
hitherto been enjoyed only by $t$ and $H$, cannot, however, be attained
if we retain the definition of the Hamiltonian operator

$$H = \frac{1}{2m}(p_x^2 + p_y^2 + p_z^2) + U$$

which has so far been used and which corresponds to pre-relativistic
classical mechanics. This follows from the unsymmetrical way in
which the operators $p_x$, $p_y$, $p_z$, and $p_t$ are involved in the equation
$(H + p_t)\psi = 0$.

It is possible, however, to modify the Schrödinger equation so as to
secure the desired symmetry, enabling one to formulate it in any of
the four equivalent ways $(G_x - p_x)\psi = 0$, $(G_y - p_y)\psi = 0$, $(G_z - p_z)\psi = 0$,
$(H + p_t)\psi = 0$ in agreement with the relativity theory. This modification
(due to Dirac) will be considered later (Chap. VI).


9. The Variational Form of the Schrödinger Equation and its
Application to the Perturbation Theory

If the potential energy $U$ does not involve the time explicitly, then
the equation $(H + p_t)\psi = 0$ has, as we know, particular solutions of the type
$\psi = \psi^0(x, y, z)e^{-i2\pi Wt/h}$, where the ‘amplitude’ function $\psi^0(x, y, z)$ satisfies
the equation $H\psi^0 = W\psi^0$ (which has been written before in the equivalent
form $H\psi = W\psi$). Multiplying it by $\psi^{0*}$ and integrating over the
whole space, then if, as we shall assume in future, $\int \psi^{0*}\psi^0\,dV = 1$,
we get

$$\int \psi^{0*}H\psi^0\,dV = W. \tag{55}$$

This is just what we should expect, since, according to the general
definition of probable (average) values, the integral

$$\int \psi^{0*}H\psi^0\,dV = \int \psi^*H\psi\,dV = \bar{H}$$

is the probable value $W$ of the energy which is a constant of the motion.
We shall now show that the function $\psi^0$, which may be called the
characteristic function of the operator $H$ (the time factor being
irrelevant so far as the equation $H\psi = W\psi$ is concerned), can be determined
from the variational principle

$$\delta\bar{H} = \delta\int \psi^{0*}H\psi^0\,dV = 0, \tag{55a}$$

in conjunction with the normalization condition

$$\int \psi^{0*}\psi^0\,dV = 1. \tag{55b}$$

We have in fact

$$\delta\bar{H} = \int \delta\psi^{0*}H\psi^0\,dV + \int \psi^{0*}H\delta\psi^0\,dV,$$

or, according to (51), i.e. because of the self-adjointness of $H$ and
because of the convergence of the integral $\int \psi^{0*}\delta\psi^0\,dV$,

$$\delta\bar{H} = \int \delta\psi^{0*}H\psi^0\,dV + \int \delta\psi^0H\psi^{0*}\,dV. \tag{56}$$

Further, (55 b) gives

$$\int \delta\psi^{0*}\psi^0\,dV + \int \delta\psi^0\psi^{0*}\,dV = 0. \tag{56a}$$

So long as the function $\psi^0$ is looked for as a complex quantity, it is
equivalent to two real functions. We could therefore consider $\psi^0$ and
$\psi^{0*}$ as two independent unknown functions, and treat their variations
as arbitrary independent infinitesimal quantities, were it not for the
condition (56 a). According to the Lagrange ‘method of multipliers’,
this dependence can be removed by multiplying (56 a) by some constant
factor $C$ and subtracting the result from (56). This gives

$$\int \delta\psi^{0*}(H\psi^0 - C\psi^0)\,dV + \int \delta\psi^0(H\psi^{0*} - C\psi^{0*})\,dV = 0,$$

and since $\delta\psi^{0*}$ and $\delta\psi^0$ can now be regarded as completely arbitrary,
we must have $H\psi^0 = C\psi^0$ and $H\psi^{0*} = C\psi^{0*}$.

Thus from (55 a) and (55 b) we have obtained the Schrödinger equation
for the function $\psi^0$ and its conjugate complex function. The energy $W$
appears in the variational method as the value of Lagrange’s multiplier
associated with the function $\psi^0$, and the Schrödinger equation appears
as the variational equation of Euler and Lagrange corresponding to the
‘conditional extremum’ of the integral $\bar{H} = \int \psi^{0*}H\psi^0\,dV$. This integral
can be written in a somewhat different form, one which contains
only the first derivatives of the functions $\psi^0$ and $\psi^{0*}$ (as it must do if
the variational equation is of the second order). We have in fact

$$\psi^{0*}\frac{\partial^2\psi^0}{\partial x^2} = \frac{\partial}{\partial x}\left(\psi^{0*}\frac{\partial\psi^0}{\partial x}\right) - \frac{\partial\psi^{0*}}{\partial x}\frac{\partial\psi^0}{\partial x}$$

and consequently

$$\int \psi^{0*}H\psi^0\,dV = \frac{1}{2m}\left(\frac{h}{2\pi i}\right)^2\left[\int \operatorname{div}(\psi^{0*}\nabla\psi^0)\,dV - \int \nabla\psi^{0*}\cdot\nabla\psi^0\,dV\right] + \int U\psi^{0*}\psi^0\,dV,$$

or, since the first integral in the square brackets vanishes,

$$\bar{H} = \int \left(\frac{h^2}{8\pi^2m}\nabla\psi^{0*}\cdot\nabla\psi^0 + U\psi^{0*}\psi^0\right)dV. \tag{57}$$

Putting $\mathbf{p} = \dfrac{h}{2\pi i}\nabla$ we can rewrite this expression in the form

$$\bar{H} = \int \left(\frac{1}{2m}|\mathbf{p}\psi^0|^2 + U|\psi^0|^2\right)dV, \tag{57a}$$

where $|\mathbf{p}\psi^0|^2$ is the scalar product of the vector $\mathbf{p}\psi^0$ and the conjugate
complex vector $\mathbf{p}^*\psi^{0*} = -\dfrac{h}{2\pi i}\nabla\psi^{0*}$. If, in addition, we introduce the
function $S^0 = \dfrac{h}{2\pi i}\log\psi^0$, and so replace $\mathbf{p}\psi^0$ by $\psi^0\nabla S^0$, we get

$$\bar{H} = \int \left(\frac{1}{2m}|\nabla S^0|^2 + U\right)|\psi^0|^2\,dV. \tag{57b}$$

The integrand of this expression looks exactly like the classical expression
for the total energy ($S^0$ being the Hamilton-Jacobi action function)
multiplied by $|\psi^0|^2$. It is worthy of remark that Schrödinger first
obtained his wave equation by applying the variation principle to the
integral (57 b), without fully realizing at that time (beginning of 1926)
its physical meaning.

The variational equation $\delta\bar{H} = 0$ does not mean that the values of
$\bar{H} = W$ obtained from it (with the condition $\int \psi^{0*}\psi^0\,dV = 1$) are
minimum or maximum values compared with those corresponding to
slightly varied functions $\psi^0$. In order to find out whether we actually
have an extremum or only a stationary value, we must calculate the
variation of $\bar{H}$ to the second approximation, i.e. to the second order of
the small quantities $\delta\psi^0$ and $\delta\psi^{0*}$.
We thus get

$$\Delta\bar{H} = \int (\psi^{0*} + \delta\psi^{0*})H(\psi^0 + \delta\psi^0)\,dV - \int \psi^{0*}H\psi^0\,dV = \int \delta\psi^{0*}H\psi^0\,dV + \int \psi^{0*}H\delta\psi^0\,dV + \int \delta\psi^{0*}H\delta\psi^0\,dV.$$

On the other hand, we must have

$$\int (\psi^{0*} + \delta\psi^{0*})(\psi^0 + \delta\psi^0)\,dV - \int \psi^{0*}\psi^0\,dV = \int \delta\psi^{0*}\psi^0\,dV + \int \psi^{0*}\delta\psi^0\,dV + \int \delta\psi^{0*}\delta\psi^0\,dV = 0.$$

Multiplying this equation by the value of $W$ corresponding to the
function $\psi^0$ and subtracting it from the first, we get, since $\psi^0$ and $\psi^{0*}$
satisfy the equations $H\psi^0 = W\psi^0$, $H\psi^{0*} = W\psi^{0*}$,

$$\Delta\bar{H} = \int \delta\psi^{0*}(H - W)\delta\psi^0\,dV, \tag{58}$$

which can also be written in the form

$$\Delta\bar{H} = \int \left[\frac{1}{2m}|\mathbf{p}\,\delta\psi^0|^2 + (U - W)|\delta\psi^0|^2\right]dV. \tag{58a}$$

This expression can be considered as the second variation of $\bar{H}$, since
it is a small quantity of the second order. Its sign is, in general,
uncertain: it may be positive for some variations $\delta\psi$ and negative for
others. The values $\bar{H} = W$ given by the variational principle $\delta\bar{H} = 0$
must therefore be regarded as stationary and not as minimum or
maximum values. The preceding results are simplified if we assume (as
we are usually entitled to do when we are dealing with stationary
states with no magnetic field present) that the wave function $\psi^0$ is real;
we need hardly, however, restate them in this simplified form.

The variational principle provides us with a very simple and important
method for obtaining approximate solutions of Schrödinger’s equation
and determining the corresponding energy values, or rather for improving
such approximate solutions and energy values after they have been
obtained by some other method.† Thus the variational method is useful
in determining the motion due to a field of force which is slightly different
from some simpler field of force for which the motion is supposed to be
known. The solution of this question is one of the two main problems of the
perturbation theory, the other problem being the determination of transition
probabilities which has already been considered briefly in Part I.
We shall give a detailed treatment of the perturbation theory in a later
chapter. At present we shall briefly indicate those of its results which
can be obtained, in a straightforward way, by the variational method.

† The method of reducing the solution of a differential equation of the type
$H\psi^0 = W\psi^0$ to a variational problem has been worked out by Lord Rayleigh and much
later by W. Ritz in connexion with the problems of the vibration of elastic bodies, which
are formally very similar to the problem of the motion of a particle in wave mechanics.

Let us suppose that, somehow or other, we have obtained a function
$\phi^0(x, y, z; a)$ which we know to be capable of approximately representing
one of the characteristic functions of the operator $H$ provided the
undetermined parameter $a$, contained in it, is suitably chosen. Then
this particular value of $a$ can be determined from the equation

$$\frac{d\bar{H}(a)}{da} = W\frac{dE(a)}{da}, \tag{59}$$

where

$$\bar{H}(a) = \int \phi^{0*}(x, y, z; a)H\phi^0(x, y, z; a)\,dV, \tag{59a}$$

and

$$E(a) = \int \phi^{0*}\phi^0\,dV, \tag{59b}$$

in conjunction with the relation $\bar{H}(a) = W$, which gives the corresponding
value of the energy. If the function is normalized to 1 (according
to $E = 1$) for every value of $a$, equation (59) can be replaced by
$d\bar{H}(a)/da = 0$.
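A small numerical sketch may make the one-parameter method concrete. Everything specific in it is an assumption chosen for illustration: one dimension, units in which $h/2\pi = m = 1$, the potential $U = x^2/2$, and the trial function $\phi^0(x; a) = e^{-ax^2}$. Instead of normalizing the trial function for every $a$, the sketch divides $\bar{H}(a)$ by $E(a)$, which amounts to the same thing; $\bar{H}(a)$ is evaluated in the first-derivative form (57). For the lowest level the stationary value is in fact a minimum, which is why a simple minimizer is used here.

```python
import math


def trial(x, a):
    """Trial function phi(x; a) = exp(-a x^2) (hypothetical choice)."""
    return math.exp(-a * x * x)


def trial_dx(x, a):
    """Derivative of the trial function with respect to x."""
    return -2.0 * a * x * math.exp(-a * x * x)


def integrate(g, lo=-12.0, hi=12.0, n=4000):
    """Composite midpoint rule on [lo, hi]."""
    step = (hi - lo) / n
    return step * sum(g(lo + (i + 0.5) * step) for i in range(n))


def energy(a):
    """H(a) / E(a), with H(a) in the first-derivative form (57), in 1D."""
    h_bar = integrate(lambda x: 0.5 * trial_dx(x, a) ** 2
                      + 0.5 * x * x * trial(x, a) ** 2)
    e = integrate(lambda x: trial(x, a) ** 2)
    return h_bar / e


def golden_minimum(fn, lo, hi, tol=1e-6):
    """Golden-section search for the minimum of a unimodal function."""
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    x1 = hi - ratio * (hi - lo)
    x2 = lo + ratio * (hi - lo)
    f1, f2 = fn(x1), fn(x2)
    while hi - lo > tol:
        if f1 < f2:
            # Minimum lies in [lo, x2]; reuse x1 as the new upper probe.
            hi, x2, f2 = x2, x1, f1
            x1 = hi - ratio * (hi - lo)
            f1 = fn(x1)
        else:
            # Minimum lies in [x1, hi]; reuse x2 as the new lower probe.
            lo, x1, f1 = x1, x2, f2
            x2 = lo + ratio * (hi - lo)
            f2 = fn(x2)
    return 0.5 * (lo + hi)


a_best = golden_minimum(energy, 0.1, 2.0)
print(a_best, energy(a_best))
```

In these units the search should settle near $a = 1/2$ with an energy near $1/2$, since a Gaussian happens to be the exact lowest characteristic function of this particular potential; with a less fortunate trial function the method returns only an approximation.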

This method, which is often used in practice, can be generalized to
include the case when the function $\phi^0$ contains many unknown parameters
$a_1, a_2, \ldots, a_r$, the closeness of the approximation in general
increasing with the number $r$ of these parameters. We come upon a
particularly simple and interesting case of such an approximation in
the perturbation theory of a degenerate motion, where we have, in
the absence of the perturbation, a set of wave functions $\psi^0_1(x, y, z)$,
$\psi^0_2(x, y, z), \ldots, \psi^0_r(x, y, z)$ representing different states of motion with the
same energy $W$. Let us assume that the potential energy $U$ has been
replaced by $U'$, the difference $U' - U$ corresponding to a small perturbing
field of force (for example, an external electric field of force).
The energy operator $H = p^2/2m + U$ must then be replaced by the
operator $H' = p^2/2m + U' = H + U' - U$, and the functions $\psi^0_1, \ldots, \psi^0_r$
must be replaced by a set of $r$ functions $\psi^{0\prime}_1, \ldots, \psi^{0\prime}_r$ referring to $r$
states of motion with nearly the same energy, i.e. belonging to $r$ energy
values $W'_1, W'_2, \ldots, W'_r$ which are slightly different from one another and
from the approximate value $W$ corresponding to the absence of perturbing
forces (the latter are, of course, supposed to be independent of
the time). Now the functions $\psi^{0\prime}_{k'}$ can be represented approximately as
linear combinations of the functions $\psi^0_l$ with unknown coefficients.
Thus we may write

$$\psi^{0\prime}_{k'} = \sum_{l=1}^{r} a_{lk'}\psi^0_l, \tag{60}$$

the $r$ coefficients $a_{1k'}, a_{2k'}, \ldots, a_{rk'}$ appearing in the expression of each
function $\psi^{0\prime}_{k'}$ playing the role of the $r$ parameters mentioned above.
Dropping the index $k'$ and substituting the expression $\psi^{0\prime} = \sum_l a_l\psi^0_l$ in
the integrals

$$\bar{H}' = \int \psi^{0\prime *}H'\psi^{0\prime}\,dV \quad\text{and}\quad E' = \int \psi^{0\prime *}\psi^{0\prime}\,dV$$

we get

$$\bar{H}' = \sum_{k=1}^{r}\sum_{l=1}^{r} H'_{kl}a_k^*a_l, \tag{60a}$$

$$E' = \sum_{k=1}^{r}\sum_{l=1}^{r} E_{kl}a_k^*a_l, \tag{60b}$$

where

$$H'_{kl} = \int \psi^{0*}_kH'\psi^0_l\,dV, \tag{60c}$$

$$E_{kl} = \int \psi^{0*}_k\psi^0_l\,dV. \tag{60d}$$

The expressions (60 c) are the matrix elements of the energy operator
$H'$ of the ‘perturbed’ motion with regard to the characteristic functions
describing the unperturbed types of motion associated with the same
energy $W$. Since these functions need not be orthogonal, the expressions
$E_{kl}$ may be different from zero for $k \neq l$.

The variational principle $\delta\bar{H}' = 0$, together with the condition
$E' = 1$, gives the following equations:

$$\frac{\partial\bar{H}'}{\partial a_k^*} = W'\frac{\partial E'}{\partial a_k^*}, \qquad \frac{\partial\bar{H}'}{\partial a_l} = W'\frac{\partial E'}{\partial a_l},$$

i.e.

$$\sum_{l=1}^{r}(H'_{kl} - W'E_{kl})a_l = 0 \qquad (k = 1, 2, \ldots, r), \tag{61}$$

$$\sum_{k=1}^{r}(H'_{kl} - W'E_{kl})a_k^* = 0. \tag{61a}$$

The second group can be obtained from the first by a change to conjugate
complex quantities in conjunction with the ‘Hermitian’ relations
(Part I, § 17) $H'_{kl} = H'^*_{lk}$ and $E_{kl} = E^*_{lk}$,
and therefore need not be considered separately. The compatibility
condition for the $r$ linear homogeneous equations (61) runs

$$\begin{vmatrix} H'_{11} - W'E_{11} & H'_{12} - W'E_{12} & \cdots & H'_{1r} - W'E_{1r} \\ H'_{21} - W'E_{21} & H'_{22} - W'E_{22} & \cdots & H'_{2r} - W'E_{2r} \\ \vdots & \vdots & & \vdots \\ H'_{r1} - W'E_{r1} & H'_{r2} - W'E_{r2} & \cdots & H'_{rr} - W'E_{rr} \end{vmatrix} = 0. \tag{61b}$$

This is an equation of the $r$th degree for $W'$; its roots $W'_1, W'_2, \ldots, W'_r$
are the required (approximate) values of the energy. The coefficients
$a_{1k'}, a_{2k'}, \ldots, a_{rk'}$
corresponding to $W' = W'_{k'}$ according to (61), specify, by means of
equation (60), that type of perturbed motion which has the energy $W'_{k'}$.
We thus see that the $r$ types of unperturbed motion which have the
same energy $W$ and which are described by the functions $\psi^0_1, \psi^0_2, \ldots, \psi^0_r$
actually give rise, under the influence of the perturbation, to the same
number of different types of motion, but these, in general, now have
different energies $W'_1, \ldots, W'_r$. This phenomenon is denoted as the
‘splitting up’ of a multiple energy-level, by the influence of perturbing
forces, into a number of ‘sub-levels’. The Zeeman and Stark effects, i.e.
the splitting of the spectrum lines under the influence of a magnetic or
electric field, are examples of this.
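For a doubly degenerate level ($r = 2$) the compatibility condition is a quadratic in $W'$ and the splitting can be written down at once. The sketch below assumes real symmetric matrices $H'_{kl}$ and $E_{kl}$; the function name and the numerical matrix elements are hypothetical values chosen only to show the mechanics.

```python
import math


def split_levels(h, e):
    """Roots W' of det(H'_kl - W' E_kl) = 0 for real symmetric 2x2 h and e."""
    (h11, h12), (h21, h22) = h
    (e11, e12), (e21, e22) = e
    # Expand the determinant as A W'^2 + B W' + C = 0.
    a = e11 * e22 - e12 * e21
    b = -(h11 * e22 + h22 * e11 - h12 * e21 - h21 * e12)
    c = h11 * h22 - h12 * h21
    disc = b * b - 4.0 * a * c
    if a <= 0 or disc < 0:
        raise ValueError("overlap matrix not positive definite, or complex roots")
    root = math.sqrt(disc)
    return (-b - root) / (2.0 * a), (-b + root) / (2.0 * a)


# Hypothetical matrix elements: unperturbed energy 1.0 on the diagonal,
# a small perturbation coupling the two orthonormal unperturbed functions.
h_matrix = [[1.0, 0.1], [0.1, 1.0]]
e_matrix = [[1.0, 0.0], [0.0, 1.0]]
w_low, w_high = split_levels(h_matrix, e_matrix)
print(w_low, w_high)
```

With these numbers the single level at 1.0 separates into two sub-levels lying symmetrically below and above it; passing a non-diagonal `e_matrix` treats unperturbed functions that are not orthogonal.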

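It should be mentioned that if the functions $\psi^0_k$ are orthogonal and
normalized to 1, i.e. if $E_{kl}$ is equal to 0 for $k \neq l$ and to 1 for $k = l$,
equations (61) assume the form

$$\sum_{l=1}^{r} H'_{kl}a_l = W'a_k \qquad (k = 1, 2, \ldots, r), \tag{62}$$

and the compatibility equation for determining the energy values
reduces to

$$\begin{vmatrix} H'_{11} - W' & H'_{12} & \cdots & H'_{1r} \\ H'_{21} & H'_{22} - W' & \cdots & H'_{2r} \\ \vdots & \vdots & & \vdots \\ H'_{r1} & H'_{r2} & \cdots & H'_{rr} - W' \end{vmatrix} = 0. \tag{62a}$$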

Equations (60), (62), and (62 a) closely resemble equations (48), (48 a),
and (48 b) derived in § 7 for the determination of the characteristic
values of an operator $F$ which is a constant of a motion involving
degeneracy. Actually they are identical, but this is slightly masked by
a difference in notation. If we replace $F$ by $H'$, reverse the role of the
‘old’ and ‘new’ functions $\psi$ and $\psi'$, replacing the $\psi$ by $\psi^{0\prime}$ and the $\psi'$
by $\psi^0$, and in addition write $H'_{kl}$ instead of $c_{kl}$ and $W'$ instead of $c'$,
then equations (48), (48 a), and (48 b) assume the form of (60), (62),
and (62 a) respectively. This coincidence shows that the operators $H$
and $H'$ must commute with one another, i.e. that, to the degree of
approximation obtained by the perturbation theory sketched above,
the perturbation energy $H' - H$ is to be considered as a constant of the
unperturbed motion specified by $H$.

This perturbation theory can easily be improved and generalized in
such a way as to become what is called a transformation theory, the
primary object of which is to derive exactly the characteristic functions
and values of a certain operator $H'$ from the characteristic functions and
values of some other operator $H$. The solution of this problem is given
by the preceding equations if, in the first place, we drop the assumption
that the original (amplitude) functions $\psi^0_1, \psi^0_2, \dots$ belong to the same
energy-level, and if, in addition, we increase $r$ to infinity, so as to use
the complete set of functions and energy-levels belonging to the operator
$H$. Equations (60) and (61) or (62), in conjunction with (61 b) or (62 a),
will then determine the complete set of functions and energy values
characteristic of the operator $H'$. Further generalizations of this
transformation theory involving operators different from the energy and
variables different from the coordinates will be examined later (Chap. IV).

It should be mentioned here that the reduction of an equation of the
form $F\psi = C\psi$ to a variational principle of the form

$$\delta\bar F = \delta\int \psi^* F\psi\,dV = 0$$

(with the condition $\int \psi\psi^*\,dV = 1$) is possible not only when $F$ is the
energy operator $H$, but in the case of all operators which are ‘self-adjoint’,
i.e. for which $f_1Ff_2 - f_2Ff_1$ = the divergence of some vector.
Actually it is not necessary for the integral $\int \psi\psi^*\,dV$ to converge. The
only assumption which it is necessary to make in order to obtain the
differential equation $F\psi = C\psi$ from the variational equation $\delta\bar F = 0$
is that $E = \int \psi\psi^*\,dV$ should be constant ($\delta E = 0$).

10. Orthogonality and Normalization of Characteristic Functions 
for Discrete and Continuous Spectra 

The characteristic functions $\psi^0$ obtained by the variation principle,
under the condition $\int \psi^0\psi^{0*}\,dV = \text{const.}$, or by the direct solution of
the equation $H\psi^0 = W\psi^0$, can form both a discrete and a continuous
set corresponding to a discrete or a continuous set of energy values $W$.
The energy values are therefore said to form a discrete or a continuous
spectrum of the energy operator $H$. As we know from the general
discussion of § 15, Part I, and from the examples of the oscillator and
the hydrogen atom, a discrete spectrum is associated with characteristic
functions which, because of ‘total reflection’, vanish at infinity so
rapidly that the integral $\int \psi^0\psi^{0*}\,dV$ converges. This makes it possible
to normalize them to 1 by means of the equation $\int \psi^0\psi^{0*}\,dV = 1$. The
characteristic functions corresponding to a continuous $W$-spectrum may
also, although not necessarily, vanish at infinity, but not rapidly
enough (because of the lack of total reflection) to ensure the convergence
of the integral $\int \psi^0\psi^{0*}\,dV$, so that their normalization to 1, or to any
other finite value, is in this case impossible.

This relationship between the convergence or non-convergence of the
integral $\int \psi^0\psi^{0*}\,dV$ (which is a measure of the probability of finding
the particle somewhere in the whole of space) and the discrete or
continuous character of the energy spectrum is intimately connected with
the relationship between the characteristic functions $\psi^0_n$ and $\psi^0_m$ which
are associated with or ‘belong to’ different values of the energy $W_n$
and $W_m$.

If the equation $H\psi^0_n = W_n\psi^0_n$ which is satisfied by $\psi^0_n$ is multiplied
by $\psi^{0*}_m$ and subtracted from the equation $H\psi^{0*}_m = W_m\psi^{0*}_m$ multiplied by
$\psi^0_n$, we get

$$\psi^0_nH\psi^{0*}_m - \psi^{0*}_mH\psi^0_n = (W_m - W_n)\,\psi^{0*}_m\psi^0_n.$$

Integrating over the whole space, and assuming the integrals $\int |\psi^0_m|^2\,dV$
and $\int |\psi^0_n|^2\,dV$ to be convergent, we get, because of the self-adjointness
of the energy operator according to (51),

$$(W_m - W_n)\int \psi^{0*}_m\psi^0_n\,dV = 0,$$

and since $W_m \neq W_n$,

$$\int \psi^{0*}_m\psi^0_n\,dV = 0. \tag{63}$$


This is the ‘orthogonality property’ which has already been deduced
for one-dimensional motion in § 17, Part I. As shown there, this
property can still be retained even when the states are degenerate, i.e.
when different functions $\psi^0_n$ and $\psi^0_m$ belong to the same energy-level,
provided these functions are suitably chosen as linear combinations of
the original ones (if the latter do not already satisfy the orthogonality
condition). If the energy values corresponding to different functions
are distinguished by different indices, irrespective of whether these
values are actually different or identical, the orthogonality relation
(63) and the normalization condition $\int \psi^0_n\psi^{0*}_n\,dV = 1$ can be fused into a
single equation

$$\int \psi^{0*}_m\psi^0_n\,dV = \delta_{mn}, \tag{63 a}$$

where $\delta_{mn} = 1$ if $m = n$ and $\delta_{mn} = 0$ if $m \neq n$.
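
The fused relation (63 a) can be checked numerically on any simple discrete set. The sketch below uses the bound states of a particle in a one-dimensional box, an example assumed here purely for illustration (the text itself refers to the oscillator and the hydrogen atom); the function names are hypothetical.

```python
import math

def box_state(n, x, length=1.0):
    """Assumed example: n-th bound state of a particle in the box [0, length]."""
    return math.sqrt(2.0 / length) * math.sin(n * math.pi * x / length)

def overlap(m, n, length=1.0, steps=2000):
    """Midpoint-rule estimate of the integral of psi_m * psi_n over the box."""
    h = length / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        total += box_state(m, x, length) * box_state(n, x, length)
    return total * h

# Print the 3 x 3 table of overlaps; it should approximate the unit matrix.
for m in range(1, 4):
    print([round(overlap(m, n), 6) for n in range(1, 4)])
```

Since these functions are real, the complex conjugation in (63 a) plays no part in this particular check.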

It should be mentioned that the existence of degeneracy must be 
regarded not as a general rule, but rather as an exceptional occurrence. 
It only arises in a few cases in which the particle is moving in an 
exceptionally simple field of force. Nevertheless, the simple types of
the potential-energy function U corresponding to these simple fields 
of force are of great practical importance. 

As shown in Part I when discussing examples of motion in three
dimensions, the different characteristic functions are specified by the
values of three quantum numbers $n_1, n_2, n_3$, which, from the geometrical
point of view, give the number of nodal surfaces of the different kinds
and which, from the dynamical point of view, specify the characteristic
values of three operators $F_1, F_2, F_3$, representing three independent
constants of the motion which is described by the corresponding
characteristic function. The energy operator $H$ can be defined as a certain
function of the operators $F_1, F_2, F_3$, its characteristic values being
equal to the same function of the characteristic values $C'_{n_1}, C''_{n_2}, C'''_{n_3}$
of these three operators. The existence of such operators is connected
with the existence of ‘separable coordinates’ $q_1, q_2, q_3$, these coordinates
being such that each characteristic function of $H$ can be represented as
the product of three functions $\psi^{(k)}_{n_k}(q_k)$ satisfying the equations

$$F_k\,\psi^{(k)}_{n_k}(q_k) = C^{(k)}_{n_k}\,\psi^{(k)}_{n_k}(q_k) \quad (k = 1, 2, 3). \tag{64}$$

With

$$\psi^0_{n_1n_2n_3}(x, y, z) = \psi'_{n_1}(q_1)\,\psi''_{n_2}(q_2)\,\psi'''_{n_3}(q_3) \tag{64 a}$$

these become

$$H(F_1, F_2, F_3)\,\psi^0_{n_1n_2n_3} = W(C', C'', C''')\,\psi^0_{n_1n_2n_3}, \tag{64 b}$$

where $W(C', C'', C''')$ is the same function of the numbers $C', C'', C'''$ as
$H$ is of the operators $F_1, F_2, F_3$.

In the approximate quasi-classical determination of the function $\psi^0$ in
the form $e^{i2\pi S^0/h}$, where $S^0$ is the action function of the Hamilton-Jacobi
theory, the product relation (64 a) corresponds to the additive relation

$$S^0(x, y, z) = S'(q_1) + S''(q_2) + S'''(q_3) \tag{64 c}$$

which serves to define the separable coordinates in the classical sense.
The quantum numbers $n_1, n_2, n_3$ are introduced by the condition that
the periodicity moduli of $S^{(k)}(q_k)$ must be integral multiples $n_k$ of $h$.
The energy $W(C', C'', C''')$ can be written as a function of the quantum
numbers in the form $W_{n_1n_2n_3}$. We have degeneracy when the energy
actually depends on only two or one of these numbers, or upon their
sum, as in the case of a hydrogen-like atom, where we may assume that
$n_1$ denotes the radial quantum number, $n_2 = l$ the angular quantum
number, and $n_3 = m$ the axial quantum number, $F_2$ being the operator
$M^2$ and $F_3$ the operator $M_z$, and hence

=-- P<,,M K.i.n.fa) = «'"*• 

It is always possible to arrange the triplets of numbers $n_1, n_2, n_3$ in
a single row and to specify the functions $\psi^0$ and the energy-levels $W$
by a single index $n$ indicating the position of the corresponding triplet
in the row. The indices $n$ ($\psi^0_n$, $W_n$) so obtained will, of course, have no
connexion with the quantum numbers. One can also use a kind of vector
notation, writing $n$ as an abbreviation for the three indices $n_1, n_2, n_3$.
This is the notation used in § 17 of Part I, and we shall use it in future
when dealing with states of motion belonging to a discrete spectrum.

A continuous spectrum of the energy operator $H$ arises when at
least one of the three operators $F$, corresponding to the separation
coordinates, has a continuous spectrum of characteristic values, the
spectra of the other two operators remaining discrete (although of
course they may be continuous too). This case occurs with hydrogen-like
atoms in the region of positive energy values, i.e. in the region
corresponding to the non-periodic (hyperbolic) motions of the classical
theory. The wave functions can still, in this case, be written in
the form of a product (64 a), the radial quantum number ($n_1$) being
replaced by a continuously variable parameter. We may take as this
parameter the characteristic values $C'$ of the operator $F_1$ itself, or
the values of the energy which it determines in conjunction with the
quantized parameters $C''$ and $C'''$. It will be convenient to use for the
characteristic functions belonging to a continuous energy spectrum a
notation similar to that corresponding to the discrete case, replacing
the quantum numbers as indices by the characteristic values of the
operators $F$ and writing $C$ as an abbreviation for the triplet $C', C'', C'''$,
so that the characteristic functions and energies are written $\psi^0_C(x, y, z)$
and $W_C$ respectively. If this abbreviation is not desired, it may be
preferable to use a mixed notation involving continuously variable
parameters as well as quantum numbers (e.g. the characteristic functions
of the hydrogen-like atom can be written in the form $\psi^0_{Wlm}$, where the
energy $W$ stands for the continuously variable parameter $C'$).

It should be mentioned that a continuous spectrum corresponds to
non-quantizable or partially quantizable motions that can be
described quasi-classically, i.e. with an approximately determined action
function $S^0$, which is either single-valued, or has a many-valuedness of
a kind restricted to one or two of the parts into which it is separated
according to (64 c). The wave functions $\psi^0_C$ belonging to a continuous
spectrum $W_C$ do not possess the orthogonality property which is
characteristic of the functions $\psi^0_n$ belonging to the discrete spectrum,
since, as we saw when deriving the orthogonality relation (63), this
relation depends not only upon the self-adjointness of the operator $H$,
but also on the convergence of the integrals $\int |\psi^0|^2\,dV$. These integrals
converge for $\psi^0 = \psi^0_n$ but do not converge for $\psi^0 = \psi^0_C$.

The connexion between the lack of orthogonality and the continuous
character of the energy spectrum can be illustrated by the following
argument. Let us suppose that $\psi^0_{C_1}$ and $\psi^0_{C_2}$ are two functions belonging
to two different energy-levels $W_{C_1}$ and $W_{C_2}$. Since the latter form a
continuous series, their difference can be made arbitrarily small. Now
if the orthogonality relation (63) applied to the continuous case, then
the integral $\int \psi^{0*}_{C_1}\psi^0_{C_2}\,dV$ would jump discontinuously from zero to
infinity as we go from nearly equal values of $C_1$ and $C_2$ (corresponding
to nearly equal values of the energy) to the limiting case $C_1 = C_2$.

It should also be mentioned that, with the exception of a motion
with one degree of freedom, i.e. specified by one coordinate only, the
continuous spectrum possesses a degeneracy of an infinitely high degree,
in the sense that each energy value can be associated with an infinite
number of different states of motion, represented by different functions
$\psi^0_C$.

In the case of a continuous energy spectrum it is possible, and
indeed is often necessary, to consider not merely exactly defined states
of motion corresponding to perfectly definite values of the continuously
variable parameters $C$, but rather states of motion represented by a
superposition of exactly defined states corresponding to a very small
range $\Delta C$ of these parameters, i.e. by wave functions of the type

$$\phi_{\Delta C} = \int_{\Delta C} \psi_C\,dC, \tag{65}$$

where the integration is extended over the range $\Delta C$. The wave
functions obtained in this way obviously represent a generalization of those
functions which have been used in Part I to represent ‘wave groups’
or ‘wave packets’. In defining these generalized ‘wave-packet’
functions, we must take into account the time factor in the expression
$\psi_C = \psi^0_Ce^{-i2\pi W_Ct/h}$, since the energy $W_C$ is also a function of $C$. So long,
however, as the region $\Delta C$ is very small, the function (65) can be
written in the form

$$\phi_{\Delta C} = \phi^0_{\Delta C}\,e^{-i2\pi W_{C_0}t/h}, \tag{65 a}$$

where $C_0$ denotes some arbitrarily chosen ‘point’ contained in $\Delta C$, and
$\phi^0_{\Delta C}$ is a certain function not only of the coordinates, but also of the
time, representing the propagation of the wave packet.

For various reasons, it is usually more convenient to consider the
functions $\phi^0_{\Delta C}$ at a particular instant $t = 0$, in which case they can be
defined by the integral

$$\phi^0_{\Delta C} = \int_{\Delta C} \psi^0_C\,dC, \tag{65 b}$$

and to represent the inexactly defined states of motion for any time by
the product of (65 b) by $e^{-i2\pi W_{C_0}t/h}$.

Let us imagine that the whole region formed by the variable
parameters $C$ (it may be a ‘line’, a ‘surface’, or a ‘space’, depending upon
the number of continuously variable parameters in the triplet denoted
by $C$) is divided into very small elements $\Delta C_1, \Delta C_2, \dots, \Delta C_n, \dots$ which do
not overlap, and let us consider instead of the exact states the
inaccurately determined states which are represented by the amplitude
functions $\int_{\Delta C_n} \psi^0_C\,dC$ ($n = 1, 2, 3, \dots$). These states can be associated with
a discrete set of energy values $W_n$ referring to certain (arbitrarily
chosen) points of the corresponding elementary regions $\Delta C_n$.

It can be shown that in the limiting case when the size of each region
is decreased to zero (their number increasing to infinity) the functions

$$\psi^0_n = \frac{1}{\sqrt{\Delta C_n}}\int_{\Delta C_n} \psi^0_C\,dC \tag{66}$$

behave in the same way as the ordinary amplitude functions $\psi^0_n$ belonging
to a discrete spectrum, i.e. in such a way that the integrals $\int \psi^{0*}_m\psi^0_n\,dV$
are convergent. This result follows from the oscillatory character of
the functions $\psi^0_C$ at large distances (see below). Since the functions
(66) satisfy in the limit the same equation as the corresponding exact
functions (for $W = W_n$), it follows that they must be mutually
orthogonal and further that they can be normalized to 1, so that we can put

$$\int \psi^{0*}_m\psi^0_n\,dV = \delta_{mn}. \tag{66 a}$$

Let us consider, for example, the functions

$$\psi^0_k = A(k)\,e^{i2\pi kx},$$

which describe a force-free one-dimensional motion with a momentum
$g = hk$ and a kinetic energy $W = k^2h^2/2m$.

If we regard $A$ as a slowly varying function of $k$, we get

$$\phi^0_{\Delta k} = \int_{k_1-\frac12\Delta k}^{k_1+\frac12\Delta k} \psi^0_k\,dk = A(k_1)\int_{k_1-\frac12\Delta k}^{k_1+\frac12\Delta k} e^{i2\pi kx}\,dk = A(k_1)\,e^{i2\pi k_1x}\,\frac{\sin \pi\Delta k\,x}{\pi x}.$$

We thus obtain, replacing the volume integration by an integration
along the $x$-axis,

$$\int_{-\infty}^{+\infty} |\psi^0_n|^2\,dx = |A(k_1)|^2 \lim_{\Delta k \to 0} \frac{1}{\Delta k}\int_{-\infty}^{+\infty} \frac{\sin^2 \pi\Delta k\,x}{\pi^2x^2}\,dx = |A(k_1)|^2,$$

i.e. by (66 a),

$$|A(k)|^2 = 1.$$

It should be noticed that the normalizing condition only determines
the modulus of the coefficient $A(k)$. We can still multiply it by an
arbitrary factor of the form $e^{i\alpha(k)}$.

Likewise we find for two intervals $\Delta k_1$ and $\Delta k_2$ about the different
mean values $k_1$ and $k_2$:

$$\phi^{0*}_{\Delta k_1}\phi^0_{\Delta k_2} = A^*(k_1)A(k_2)\,e^{i2\pi(k_2-k_1)x}\,\frac{\sin \pi\Delta k_1x}{\pi x}\,\frac{\sin \pi\Delta k_2x}{\pi x}.$$

If, for simplicity, we put $\Delta k_2 = \Delta k_1 = \Delta k$ ($k_2 \neq k_1$), then the integral
$\frac{1}{\Delta k}\int \phi^{0*}_{\Delta k_1}\phi^0_{\Delta k_2}\,dx$ assumes the form

$$\frac{A^*(k_1)A(k_2)}{\pi}\int_{-\infty}^{+\infty} e^{i2\xi(k_2-k_1)/\Delta k}\,\frac{\sin^2\xi}{\xi^2}\,d\xi \quad (\xi = \pi\Delta k\,x).$$

When $\Delta k \to 0$ the quantity $(k_2-k_1)/\Delta k$ becomes infinite and therefore
this integral must in the limit be zero. These results can easily be
generalized so as to apply to free motion in three dimensions,
represented by a wave function of the form

$$\psi^0_{\mathbf k} = A(\mathbf k)\,e^{i2\pi\mathbf k\cdot\mathbf r} = A(k_x, k_y, k_z)\,e^{i2\pi(k_xx+k_yy+k_zz)},$$

since this function is equal to the product of three functions
representing one-dimensional motions parallel to the three coordinate axes
respectively, the integrals both with respect to $k_x, k_y, k_z$ as well as with
respect to $x, y, z$ thus reducing to products of integrals for the separate
components. (It should be remarked that $\Delta C$ must be defined in this
case as the product $\Delta k_x\,\Delta k_y\,\Delta k_z$.)
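
The normalization and mutual orthogonality of the quasi-discrete functions (66) for free one-dimensional motion can be tried out numerically. The sketch below is only an illustration under stated assumptions: it takes $A = 1$, a finite width $\Delta k = 1$ rather than the limit $\Delta k \to 0$, and truncates the $x$-integration at a hypothetical cut-off `half_width`, so that the results are approximate; the function names are not from the text.

```python
import cmath
import math

def packet(x, k_mean, dk):
    """Quasi-discrete function (66) built from exp(i 2 pi k x) with A = 1."""
    if x == 0.0:
        envelope = dk  # limiting value of sin(pi dk x) / (pi x) at x = 0
    else:
        envelope = math.sin(math.pi * dk * x) / (math.pi * x)
    return cmath.exp(2j * math.pi * k_mean * x) * envelope / math.sqrt(dk)

def packet_overlap(k1, k2, dk, half_width=200.0, step=0.01):
    """Midpoint estimate of the integral of conj(psi_1) * psi_2 over |x| < half_width."""
    steps = int(round(2 * half_width / step))
    total = 0j
    for i in range(steps):
        x = -half_width + (i + 0.5) * step
        total += packet(x, k1, dk).conjugate() * packet(x, k2, dk)
    return total * step

norm = packet_overlap(1.0, 1.0, 1.0)    # same interval: close to 1
cross = packet_overlap(1.0, 3.0, 1.0)   # non-overlapping intervals: close to 0
print(abs(norm), abs(cross))
```

The small deviation of the first number from unity comes from the slowly decaying tails beyond the cut-off, which fall off only as $1/x^2$.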

The general proof of the quadratic integrability of the functions
(66) can be derived from a very simple physical consideration, namely,
from the fact that, at very large distances, the motion represented by
any function $\psi_C$ must approximate to a force-free motion, at least in
all problems of practical interest for which the field of force determining
the motion of the particle is supposed to vanish at infinity.

Taking again the function $\psi^0_k = e^{i2\pi kx}$ as a typical representative of
wave functions belonging to a continuous spectrum (for the case of
one-dimensional motion), let us consider the double integral

$$J = \iint \psi^{0*}_{k_1}\psi^0_{k_2}\,dx\,dk_2 = \iint e^{i2\pi(k_2-k_1)x}\,dx\,dk_2,$$

extended from $-\infty$ to $+\infty$ both with regard to $k_2$ and $x$. Since each
of the simple integrals over $k_2$ and over $x$ taken separately between
these limits does not have a definite value, let us define the value of
$J$ as the limit of

$$J'_\kappa = \int_{-\infty}^{+\infty} dx \int_{k_1-\frac12\kappa}^{k_1+\frac12\kappa} e^{i2\pi(k_2-k_1)x}\,dk_2$$

for $\kappa \to \infty$, or the limit of

$$J''_\xi = \int_{-\infty}^{+\infty} dk_2 \int_{-\xi}^{+\xi} e^{i2\pi(k_2-k_1)x}\,dx$$

for $\xi \to \infty$. In the former case we have

$$\int_{k_1-\frac12\kappa}^{k_1+\frac12\kappa} e^{i2\pi(k_2-k_1)x}\,dk_2 = \frac{\sin \pi\kappa x}{\pi x}, \qquad J'_\kappa = \frac{1}{\pi}\int_{-\infty}^{+\infty} \frac{\sin p}{p}\,dp = 1,$$

independently of $\kappa$, and therefore in particular for $\kappa = \infty$, which gives
$J = 1$. In the latter case we get similarly

$$\int_{-\xi}^{+\xi} e^{i2\pi(k_2-k_1)x}\,dx = \frac{\sin 2\pi(k_2-k_1)\xi}{\pi(k_2-k_1)}$$

and

$$J''_\xi = \int_{-\infty}^{+\infty} \frac{\sin 2\pi(k_2-k_1)\xi}{\pi(k_2-k_1)}\,dk_2 = \frac{1}{\pi}\int_{-\infty}^{+\infty} \frac{\sin p}{p}\,dp = 1,$$

independently of $\xi$, and in particular for $\xi = \infty$. The two definitions of
$J$ thus lead to the same result, namely, $J = 1$.

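Let us now assume that $\psi^0_k = A(k)\,e^{i2\pi kx}$, where $A(k)$ is some relatively
slowly varying (non-oscillatory) function of $k$, and let us define the
double integral

$$J = \int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty} \psi^{0*}_{k_1}\psi^0_{k_2}\,dk_2\,dx$$

as the limit of

$$J_\xi = \int_{-\infty}^{+\infty} dk_2 \int_{-\xi}^{+\xi} \psi^{0*}_{k_1}\psi^0_{k_2}\,dx$$

for $\xi = \infty$. Then since

$$J_\xi = A^*(k_1)\int_{-\infty}^{+\infty} A(k_2)\,\frac{\sin 2\pi(k_2-k_1)\xi}{\pi(k_2-k_1)}\,dk_2,$$

we get $J = A^*(k_1)A(k_1) = |A(k_1)|^2$.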

Hence it follows that the ‘normalization’ $|A(k_1)|^2 = 1$ which has been
derived above for the function $\psi^0_k = A(k)\,e^{i2\pi kx}$ with the help of (66)
and (66 a) (with $n = m = k$) can be obtained just as well from the
condition $\int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty} \psi^{0*}_{k_1}\psi^0_{k_2}\,dk_2\,dx = 1$. This result can easily be generalized
for any functions $\psi^0_C$ belonging to a continuous energy spectrum, the
normalization condition of the usual type for the quasi-discrete functions

$$\psi^0_n = \lim_{\Delta C \to 0} \frac{1}{\sqrt{\Delta C_n}}\int_{\Delta C_n} \psi^0_C\,dC,$$

namely, $\int \psi^0_n\psi^{0*}_n\,dV = 1$,
being equivalent to the condition

$$\iint \psi^{0*}_{C_1}\psi^0_{C_2}\,dC_2\,dV = 1. \tag{67}$$

The latter is similar to the equation

$$\sum_n \int \psi^{0*}_m\psi^0_n\,dV = 1$$

for functions belonging to a discrete spectrum. This equation is an
immediate consequence of the normalization and orthogonality relations

$$\int \psi^{0*}_m\psi^0_n\,dV = \delta_{mn}.$$

It is possible to treat equation (67) in a similar way, i.e. to consider it as
a corollary following from an orthogonality and normalization relation
for the functions $\psi^0_C$ which, according to Dirac, can be written in
the form

$$\int \psi^{0*}_{C_1}\psi^0_{C_2}\,dV = \delta(C_2 - C_1), \tag{67 a}$$

where $\delta(C)$ denotes a somewhat unusual type of function, rather defined
by the left side of this equation (together with the condition (67)) than
defining it. As a matter of fact, this function does not depend upon
the particular type of the function $\psi^0_C$, so long as $\psi^0_C$ satisfies the
condition (67) which reduces to

$$\int \delta(C_2 - C_1)\,dC_2 = 1,$$

or

$$\int \delta(C)\,dC = 1, \tag{67 b}$$

the integration being extended over all values of the continuously
variable parameter (or parameters) $C$.

It is obvious that for $C = 0$ (i.e. $C_2 = C_1$), the function $\delta(C)$ becomes
infinite. It seems, however, impossible to assign to it a definite value
for $C \neq 0$. Take, for example, the normalized function $\psi^0_k = e^{i2\pi kx}$
(with $C = k$). According to the definition (67 a), we have

$$\delta(k_2 - k_1) = \int_{-\infty}^{+\infty} e^{i2\pi(k_2-k_1)x}\,dx,$$

i.e.

$$\delta(k) = \int_{-\infty}^{+\infty} e^{i2\pi kx}\,dx. \tag{68}$$

This expression has no definite value. We can, however, replace it, as
we have actually done above in the evaluation of the integral $J$, by

$$\delta_\xi(k) = \int_{-\xi}^{+\xi} e^{i2\pi kx}\,dx, \tag{68 a}$$

and pass to the limit $\xi \to \infty$ after the completion of all the calculations in
which the function $\delta_\xi(k)$ enters, and in particular after integration over
$k$ (which always forms a part of these calculations). The result will
have a perfectly definite value, and indeed the same value as that which
would be obtained by putting from the very beginning

$$\delta(k) = 0 \ \text{ for } \ k \neq 0 \quad \text{and} \quad \int \delta(k)\,dk = 1. \tag{68 b}$$

The above calculation of the integral $J = \int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty} \psi^{0*}_{k_1}\psi^0_{k_2}\,dk_2\,dx$ for a
function of the type $\psi^0_k = A(k)\,e^{i2\pi kx}$, subject to the normalizing condition
$J = 1$, serves to illustrate these relations.
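
The prescription of integrating over $k$ first and only then letting $\xi$ increase can be followed numerically. In the sketch below the truncated integral (68 a) is used in its closed form $\sin 2\pi k\xi/\pi k$, and it is integrated against a slowly varying amplitude; the Gaussian amplitude, the cut-off `k_range`, and the function names are assumptions made for illustration only.

```python
import math

def delta_xi(k, xi):
    """Closed form of (68 a): integral of exp(i 2 pi k x) over |x| <= xi."""
    if k == 0.0:
        return 2.0 * xi
    return math.sin(2.0 * math.pi * k * xi) / (math.pi * k)

def smeared_value(amplitude, k1, xi, k_range=8.0, step=0.001):
    """Midpoint estimate of the integral of A(k2) * delta_xi(k2 - k1) over k2."""
    steps = int(round(2 * k_range / step))
    total = 0.0
    for i in range(steps):
        k2 = -k_range + (i + 0.5) * step
        total += amplitude(k2) * delta_xi(k2 - k1, xi)
    return total * step

def gaussian(k):
    # Hypothetical slowly varying amplitude A(k), chosen only for illustration.
    return math.exp(-k * k)

# As xi grows, the smeared value approaches A(k1), as (68 b) would give directly.
for xi in (0.2, 2.0, 20.0):
    print(xi, smeared_value(gaussian, 0.3, xi), gaussian(0.3))
```

For small $\xi$ the result differs noticeably from $A(k_1)$; for large $\xi$ it agrees with it closely, although $\delta_\xi(k)$ itself does not tend to any definite value for $k \neq 0$.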

We may thus say that the functions $\psi^0_C$ belonging to a continuous
spectrum, though not orthogonal to one another in the strict sense of
the term, can be treated as if they were orthogonal to one another and
can be normalized according to the conditions (67 a) and (67 b) with
$\delta(C) = 0$ for $C \neq 0$.

The usual normalization $\int \psi^0_n\psi^{0*}_n\,dV = 1$ for a function belonging to
a discrete spectrum is equivalent to putting the total probability of
finding the particle under consideration somewhere in the whole of
space equal to 1. The normalization (67) or (67 a) can be interpreted
as expressing the fact that the relative probability of finding the
particle within a finite region of space containing the field of force
in which it is moving is infinitely small compared with the
probability of finding it at infinity (where it moves practically as a free
particle). Under these circumstances it is more convenient to normalize
the total probability to infinity rather than to unity. This normalizing
to infinity, corresponding to the relation (67) or (67 a), is equivalent
to the usual type of normalization for the quasi-discrete functions
$\frac{1}{\sqrt{\Delta C}}\int_{\Delta C} \psi^0_C\,dC$, each of which represents a kind of ‘frozen’ wave
packet.

11. Matrix Representation of Physical Quantities and Matrix
Form of the Equations of Motion

If a particle is moving in a constant field of force, defined by a potential
energy $U(x, y, z)$ which does not depend upon the time, its total energy
$W$ remains constant. A ‘conservative motion’ of this kind is described,
in wave mechanics, by a particular solution of the equation $(H + p_t)\psi = 0$
of the type $\psi = \psi^0(x, y, z)\,e^{-i2\pi Wt/h}$, where the amplitude function $\psi^0$ and
the associated energy constant satisfy the equation $H\psi^0 = W\psi^0$. If the
particular solutions of the equation $(H + p_t)\psi = 0$, where the
Hamiltonian $H$ does not contain the time explicitly, form a discrete set
corresponding to a discrete spectrum of $H$, then the general solution
can be represented as a sum of these particular solutions with arbitrary
constant coefficients. Thus we may write

$$\psi = \sum_n a_n\psi_n = \sum_n a_n\psi^0_n\,e^{-i2\pi W_nt/h}, \tag{69}$$

the functions $\psi^0_n$ being supposed to be so normalized that they satisfy
the condition $\int |\psi^0_n|^2\,dV = 1$.

If the functions $\psi$ form a continuous set, the summation must be
replaced by an integration giving

$$\psi = \int a(C)\,\psi_C\,dC = \int a(C)\,\psi^0_C\,e^{-i2\pi W_Ct/h}\,dC, \tag{69 a}$$

where $C$ represents the continuously variable parameters. If some of
the three parameters are quantized while the others are continuously
variable, the summation must be replaced by a combined summation
and integration. Thus, for example, we may have

$$\psi = \sum_{n_2}\sum_{n_3}\int a_{C_1n_2n_3}\,\psi_{C_1n_2n_3}\,dC_1, \tag{69 b}$$

the functions $\psi^0_C$ or $\psi^0_{C_1n_2n_3}$ being so normalized that they satisfy the
condition (67), and $a(C) = a_C$ being arbitrary functions of the
continuously variable parameters $C$.

If, as is generally the case, the energy spectrum consists of a
discrete part $W_n$ and a continuous part $W_C$, the general solution of the
equation $(H + p_t)\psi = 0$ is represented by a sum of (69) and (69 a) or
(69 b), so that

$$\psi = \sum_n a_n\psi_n + \int a_C\,\psi_C\,dC, \tag{69 c}$$

or

$$\psi = \sum_{n_1}\sum_{n_2}\sum_{n_3} a_{n_1n_2n_3}\,\psi_{n_1n_2n_3} + \sum_{n_2}\sum_{n_3}\int a_{C_1n_2n_3}\,\psi_{C_1n_2n_3}\,dC_1. \tag{69 d}$$

We shall first examine the simplest case, i.e. the representation (69)
corresponding to a discrete spectrum. As already explained in Part I,
§ 17, the summation, from the point of view of the probability theory,
expresses the alternative character of the motions represented by the
different functions $\psi_n$ or $\psi^0_n$. The resulting function $\psi$ can be normalized
to unity in the same way as the separate functions $\psi_n$, i.e. it can be
made to satisfy the condition

$$\int \psi\psi^*\,dV = 1. \tag{70}$$

According to (69), in conjunction with the orthogonality and normalizing
relations $\int \psi_m^*\psi_n\,dV = \delta_{mn}$, it then follows that

$$\sum_n a_na_n^* = 1. \tag{70 a}$$

The quantities $a_na_n^* = |a_n|^2$ can be interpreted, subject to this
condition, as the probabilities of finding the particle in a state of motion
specified by the function $\psi_n$, irrespective of its position in space.

The probable (or average) value of any quantity represented by an
operator $F$ is determined by the general formula

$$\bar F = \int \psi^*F\psi\,dV.$$

Putting $\psi = \sum_n a_n\psi_n$, we get

$$\bar F = \sum_m\sum_n a_m^*a_nF_{mn}, \tag{71}$$

where

$$F_{mn} = \int \psi_m^*F\psi_n\,dV. \tag{71 a}$$

The $F_{mn}$ are the ‘matrix elements’ of the quantity $F$ with respect to
the states of motion $\psi_m$ and $\psi_n$. Putting

$$\psi_n = \psi^0_n(x, y, z)\,e^{-i2\pi W_nt/h} = \psi^0_n\,e^{-i2\pi\nu_nt},$$

we get

$$F_{mn} = F^0_{mn}\,e^{i2\pi\nu_{mn}t}, \tag{71 b}$$

with

$$F^0_{mn} = \int \psi^{0*}_mF\psi^0_n\,dV$$

and

$$\nu_{mn} = \nu_m - \nu_n = \frac{W_m - W_n}{h} \tag{71 c}$$

(cf. Part I, §§ 17 and 18).
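
The way in which (71), (71 b), and (71 c) combine to give a time-dependent probable value can be sketched for a system with two levels. The coefficients, amplitudes, and energies below are hypothetical sample values in units with $h = 1$; the function name is likewise an assumption of this illustration.

```python
import cmath

def mean_value(a, F0, W, t, h=1.0):
    """Probable value (71) with F_mn = F0_mn * exp(i 2 pi nu_mn t), as in (71 b)."""
    total = 0j
    for m in range(len(a)):
        for n in range(len(a)):
            nu_mn = (W[m] - W[n]) / h  # transition frequency (71 c)
            F_mn = F0[m][n] * cmath.exp(2j * cmath.pi * nu_mn * t)
            total += a[m].conjugate() * a[n] * F_mn
    return total

# Hypothetical two-level example in units with h = 1.
a = [2 ** -0.5 + 0j, 2 ** -0.5 + 0j]   # coefficients with |a_1|^2 + |a_2|^2 = 1
F0 = [[0.0, 1.0], [1.0, 0.0]]          # Hermitian amplitudes without diagonal part
W = [1.0, 3.0]

for t in (0.0, 0.125, 0.25):
    print(t, mean_value(a, F0, W, t))
```

With these values the double sum reduces to $\cos 4\pi t$, i.e. a harmonic oscillation with the transition frequency $(W_2 - W_1)/h = 2$, and the imaginary part vanishes because the amplitudes are Hermitian.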

So long as the operator $F$ represents a real quantity, the matrix
elements $F_{mn}$, as well as their amplitudes, are Hermitian, i.e. they
satisfy the relations

$$F_{mn} = F^*_{nm}, \qquad F^0_{mn} = F^{0*}_{nm}. \tag{72}$$

These relations are directly evident if $F$ is a (real) function of the
coordinates alone. To establish them for the general case, let us first
put $F = p_x = \frac{h}{2\pi i}\frac{\partial}{\partial x}$. We then have

$$F_{nm} = \int \psi_n^*\,p_x\psi_m\,dV = \frac{h}{2\pi i}\int \psi_n^*\frac{\partial\psi_m}{\partial x}\,dV,$$

and consequently

$$F^*_{nm} = \int \psi_n\,p_x^*\psi_m^*\,dV = -\frac{h}{2\pi i}\int \psi_n\frac{\partial\psi_m^*}{\partial x}\,dV.$$

Now

$$\int \psi_n\frac{\partial\psi_m^*}{\partial x}\,dV = \int \frac{\partial}{\partial x}(\psi_n\psi_m^*)\,dV - \int \psi_m^*\frac{\partial\psi_n}{\partial x}\,dV,$$

and since the first integral on the right vanishes, it follows that

$$F^*_{nm} = \frac{h}{2\pi i}\int \psi_m^*\frac{\partial\psi_n}{\partial x}\,dV = F_{mn},$$

and so we get (72). The proof can easily be extended to any function
$F$ of the operators $p_x, p_y, p_z$ (and of the coordinates) not involving
complex quantities (with the exception of the $i$ in the expressions for
$p_x$ which is necessary to make these operators correspond to real
quantities).
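
The Hermitian character of the matrix of $p_x$ can be verified numerically on a simple set of functions. The sketch below assumes, purely for illustration, the real box states $\sqrt2\sin n\pi x$ on the interval $(0, 1)$, which vanish at both ends so that the integrated term drops out as in the proof; units are chosen with $h = 2\pi$, and the function name is hypothetical.

```python
import math

def px_element(m, n, steps=4000):
    """Matrix element of p_x = (h / 2 pi i) d/dx between box states, with h = 2 pi.

    The states sqrt(2) sin(n pi x) on (0, 1) are an assumed example.
    """
    dx = 1.0 / steps
    integral = 0.0
    for i in range(steps):
        x = (i + 0.5) * dx
        psi_m = math.sqrt(2.0) * math.sin(m * math.pi * x)
        dpsi_n = math.sqrt(2.0) * n * math.pi * math.cos(n * math.pi * x)
        integral += psi_m * dpsi_n
    # h / (2 pi i) equals -i in these units; psi_m is real, so psi_m* = psi_m.
    return -1j * integral * dx

p12 = px_element(1, 2)
p21 = px_element(2, 1)
print(p12, p21.conjugate())  # the two numbers should agree, as required by (72)
```

The elements are purely imaginary here, so that the matrix is Hermitian without being symmetric; this is the distinction between (72) and the self-adjointness relation discussed next.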

The relations (72) should not be confused with the self-adjointness
relation (51) which, in the case of the integral (71 a), runs

$$\int \psi_m^*F\psi_n\,dV = \int \psi_nF\psi_m^*\,dV. \tag{72 a}$$

It is equivalent to (72) only when

$$F = F^*, \tag{72 b}$$

i.e. when $F$ is a function of the coordinates alone, not involving the
operators $p_x, p_y, p_z$ or involving them in even powers only. In the latter
case, which is met with, for example, when $F$ is the energy operator

$$H = (p_x^2 + p_y^2 + p_z^2)/(2m) + U(x, y, z),$$

the Hermitian relations (72) actually reduce to the relation (72 a)
expressing the self-adjoint character of $F$. Putting $F = H$, we have,
since $H\psi_n = W_n\psi_n$,

$$H_{mn} = W_n\int \psi_m^*\psi_n\,dV.$$

Taking into account the orthogonality and normalizing relations for
the functions $\psi_n$, this reduces to

$$H_{mn} = H^0_{mn} = W_n\delta_{mn}. \tag{73}$$

We thus get by (71)

$$\bar H = \sum_n a_n^*a_nW_n = \sum_n |a_n|^2W_n. \tag{73 a}$$

This equation shows that if $\bar H$ is to be interpreted as the probable
value of the energy, then the number $|a_n|^2$ must actually be considered
as the probability of finding the particle in the state of motion
represented by the function $\psi_n$ and associated with the exactly known value
of the energy $W_n$.

Similar results hold for any operator $F$ which represents a constant
of the motion, i.e. which commutes with the energy operator. If
there is no degeneracy, i.e. if the values of the energy $W$ corresponding
to different functions $\psi_n$ are all different, then, as already shown in
§ 7, it follows from the relation $HF = FH$ that $F\psi_n = F_n\psi_n$, where
$F_n$ is a constant, namely, the value of the quantity represented by $F$
for the state in question. We thus get, in the same way as before,

$$F_{mn} = F_n\delta_{mn}$$

and

$$\bar F = \sum_n |a_n|^2F_n.$$

These relations can still be retained when there is degeneracy provided
the functions $\psi_1, \psi_2, \dots, \psi_r$ forming a degenerate set, i.e. belonging to
the same value of the energy, are so defined that they satisfy the
relations $F\psi_n = F_n\psi_n$ (this can always be done, as already shown in § 7).
If they do not satisfy these relations, we have

$$F\psi_k = \sum_{l=1}^{r} C_{kl}\psi_l$$

[cf. eq. (47 b), § 7]. Multiplying this equation by $\psi_m^*$, where $\psi_m$ is some
function of the same degenerate set, and integrating, we get

$$\int \psi_m^*F\psi_k\,dV = \sum_{l=1}^{r} C_{kl}\int \psi_m^*\psi_l\,dV = C_{km},$$

since we can always suppose the functions to be orthogonal to one
another, irrespective of the degeneracy. We thus get $C_{km} = F_{mk}$ or

$$F\psi_k = \sum_{l=1}^{r} F_{lk}\psi_l. \tag{74}$$

If $\psi_n$ is some function not belonging to the degenerate set $\psi_1, \psi_2, \dots, \psi_r$,
it follows that

$$F_{nk} = \int \psi_n^*F\psi_k\,dV = \sum_{l=1}^{r} F_{lk}\int \psi_n^*\psi_l\,dV = 0.$$

The general expression (71) thus reduces to the sum of the expressions

$$\sum_{k=1}^{r}\sum_{l=1}^{r} a_k^*a_lF_{kl} = \sum_{k=1}^{r}\sum_{l=1}^{r} a_k^*a_lF^0_{kl} \tag{74 a}$$

taken for different values of the energy $W$. The relation $F_{kl} = F^0_{kl}$
follows from $W_k = W_l$. Thus, irrespective of the degeneracy, the
probable value of the operator $F$ representing a constant of the motion
is independent of the time. This independence of $\bar F$ of the time is
therefore the general criterion of the fact that $F$ is a constant of the motion
and commutes with $H$. If there is no degeneracy, it means that all the
matrix elements of $F$ must vanish with the exception of the ‘diagonal’
elements (i.e. those with two identical indices). In the presence of
degeneracy this restriction is too narrow, the constancy of $\bar F$ being
consistent with non-vanishing values of the matrix elements of $F$ for
all those states for which the energy difference vanishes.

The relation (74) is a particular case of the general equation

$$F\psi_k = \sum_l F_{lk}\psi_l, \tag{75}$$

where the summation is extended over all the characteristic functions
of $H$, irrespective of whether they belong to the same energy or not.
This relation (75) holds for any operator $F$, and reduces to (74) when
$F$ is a constant of the motion. Equation (75) is derived in the same
way as (74) by assuming that the function $F\psi_k$ can be expanded in
a series of the type $\sum_l C_{kl}\psi_l$ with coefficients $C_{kl}$ which may be functions
of the time but do not depend upon the coordinates.† This is equivalent
to assuming that $F\psi^0_k$ can be expanded in a series of the type $\sum_l C^0_{kl}\psi^0_l$
with constant coefficients $C^0_{kl}$. In the latter case we obtain, by
multiplication by $\psi^{0*}_m$ and integration over the coordinates,

$$\int \psi^{0*}_mF\psi^0_k\,dV = \sum_l C^0_{kl}\int \psi^{0*}_m\psi^0_l\,dV = C^0_{km},$$

i.e. $C^0_{kl} = F^0_{lk}$, and

$$F\psi^0_k = \sum_l F^0_{lk}\psi^0_l. \tag{75 a}$$

From this equation it is possible to derive (75) (provided $F$ does not
contain the operator $p_t$) with the help of the relations $\psi^0_k = \psi_k\,e^{+i2\pi\nu_kt}$
and $F^0_{lk} = F_{lk}\,e^{-i2\pi\nu_{lk}t}$, where $\nu_{lk} = \nu_l - \nu_k$.

If F is not a constant of the motion, the expression (71) for its 
probable value contains terms which represent harmonic oscillations 
with the ‘transition’ frequencies $\nu_{mn} = (W_m - W_n)/h$. (The meaning of
this fact for the emission of light has been discussed in Part I, § 17.)
Taking the derivative of $\bar F$ with respect to the time, we get, according
to (71 b),

$$\frac{d\bar F}{dt} = \sum_m\sum_n a_m^* a_n\, i2\pi\nu_{mn}\,F_{mn},$$

† This assumption can be justified for a very wide class of operators satisfying certain
conditions which we shall not consider here and which are always fulfilled in practice.

or

$$\frac{d\bar F}{dt} = \frac{i2\pi}{h}\sum_m\sum_n a_m^* a_n\,(W_m - W_n)\,F_{mn}. \tag{75 b}$$
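The time derivative (75 b) can be checked numerically on a small example. The sketch below is only an illustration: the three energy levels, the coefficients a_n, the Hermitian array F0 and the choice of units with h = 1 are all assumptions made up for the purpose, not values taken from the text. It builds the probable value from the components F_mn = F0_mn exp(i 2π ν_mn t) and compares a central-difference derivative with the right side of (75 b).

```python
import cmath
import math

H_PLANCK = 1.0                       # assumed units in which h = 1
W = [0.0, 0.7, 2.3]                  # hypothetical energy levels W_n
a = [0.5, 0.5j, math.sqrt(0.5)]      # hypothetical coefficients a_n (normalized)
F0 = [[1.0, 0.5 - 0.2j, 0.1j],       # hypothetical Hermitian array of elements F0_mn
      [0.5 + 0.2j, -0.3, 0.4],
      [-0.1j, 0.4, 2.0]]

def component(m, n, t):
    """Matrix component F_mn(t) = F0_mn * exp(i 2 pi nu_mn t), nu_mn = (W_m - W_n)/h."""
    nu_mn = (W[m] - W[n]) / H_PLANCK
    return F0[m][n] * cmath.exp(2j * math.pi * nu_mn * t)

def probable_value(t):
    """Double sum of a_m^* a_n F_mn(t) over all pairs of states."""
    return sum(a[m].conjugate() * a[n] * component(m, n, t)
               for m in range(3) for n in range(3))

def derivative_75b(t):
    """Right side of (75 b): (i 2 pi / h) * sum a_m^* a_n (W_m - W_n) F_mn."""
    return sum(a[m].conjugate() * a[n] * (2j * math.pi / H_PLANCK)
               * (W[m] - W[n]) * component(m, n, t)
               for m in range(3) for n in range(3))

t, eps = 0.4, 1e-5
numerical = (probable_value(t + eps) - probable_value(t - eps)) / (2 * eps)
print(abs(numerical - derivative_75b(t)))   # small: the two derivatives agree
print(abs(probable_value(t).imag))          # ~0: a Hermitian F gives a real probable value
```

Only the off-diagonal terms with W_m different from W_n contribute to the derivative, which is the numerical counterpart of the statement that a constant of the motion has a time-independent probable value.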

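It can easily be shown that the right side of this expression is equal
to the probable value of $[H, F]$, i.e. to $2\pi i(HF-FH)/h$. We have in fact

$$FH\psi_n = FW_n\psi_n = W_n F\psi_n,$$

and, according to (75),

$$HF\psi_n = \sum_k F_{kn}\,H\psi_k,$$

so that

$$(HF-FH)_{mn} = \int \psi_m^*(HF-FH)\psi_n\,dV
= \sum_k F_{kn}\int \psi_m^* H\psi_k\,dV - W_n\int \psi_m^* F\psi_n\,dV
= F_{mn}(W_m - W_n).$$

We may thus define the operator $dF/dt$ by the matrix equation

$$\left(\frac{dF}{dt}\right)_{mn} = [H,F]_{mn} = \frac{i2\pi}{h}(W_m - W_n)F_{mn}. \tag{75 c}$$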

If, in the preceding equations, we replace $H$ by some other operator
$G$, we get, by a twofold application of (75),

$$(FG)\psi_n = F\sum_k G_{kn}\psi_k = \sum_k G_{kn}F\psi_k = \sum_k G_{kn}\sum_m F_{mk}\psi_m
= \sum_m\Bigl(\sum_k F_{mk}G_{kn}\Bigr)\psi_m.$$

On the other hand, according to the same formula (75), we have

$$(FG)\psi_n = \sum_m (FG)_{mn}\,\psi_m,$$

where $(FG)_{mn}$ are the matrix elements of the compound operator $FG$.
Therefore it follows that

$$(FG)_{mn} = \sum_k F_{mk}G_{kn}. \tag{76}$$

If we put $F_{mk} = F^0_{mk}e^{i2\pi\nu_{mk}t}$, $G_{kn} = G^0_{kn}e^{i2\pi\nu_{kn}t}$,
and take into account the relation

$$\nu_{mk}+\nu_{kn} = \frac{W_m - W_k}{h} + \frac{W_k - W_n}{h} = \frac{W_m - W_n}{h} = \nu_{mn}, \tag{76 a}$$

we get $(FG)_{mn} = (FG)^0_{mn}e^{i2\pi\nu_{mn}t}$, with

$$(FG)^0_{mn} = \sum_k F^0_{mk}G^0_{kn}. \tag{76 b}$$

This relation can be obtained directly by applying the operator $FG$ to
$\psi^0_n$ instead of $\psi_n$ and using (75 a) instead of (75).
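The statement that the product of two arrays of components keeps the same time factor exp(i 2π ν_mn t) can be verified directly. The following sketch assumes three made-up energy levels and two arbitrary arrays F0 and G0 (none of them from the text) and units with h = 1; it multiplies the time-dependent components by the row-by-column rule (76) and compares the result with the product of the constant elements (76 b) carrying the single factor exp(i 2π ν_mn t).

```python
import cmath
import math

def matmul(a, b):
    """Row-by-column product: (ab)_mn = sum over k of a_mk * b_kn."""
    size = len(a)
    return [[sum(a[m][k] * b[k][n] for k in range(size)) for n in range(size)]
            for m in range(size)]

def components(elements, levels, t, h=1.0):
    """F_mn = F0_mn * exp(i 2 pi (W_m - W_n) t / h)."""
    size = len(levels)
    return [[elements[m][n]
             * cmath.exp(2j * math.pi * (levels[m] - levels[n]) * t / h)
             for n in range(size)] for m in range(size)]

W = [0.3, 1.1, 2.6]                                   # hypothetical energy levels
F0 = [[1, 2 - 1j, 0.5], [2 + 1j, 0, 1j], [0.5, -1j, 3]]   # hypothetical elements
G0 = [[0, 1, 1j], [1, 2, 0.25], [-1j, 0.25, 1]]           # hypothetical elements
t = 0.37

lhs = matmul(components(F0, W, t), components(G0, W, t))   # rule (76)
rhs = components(matmul(F0, G0), W, t)                     # (76 b) with one phase factor
max_diff = max(abs(lhs[m][n] - rhs[m][n]) for m in range(3) for n in range(3))
print(max_diff)   # rounding error only
```

The intermediate frequencies drop out because ν_mk + ν_kn = ν_mn for every k, which is relation (76 a).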

It should be noticed that equations (76) or (76 b) coincide with
the equations of § 18, Part I, which were derived by combining the
multiplication and addition laws for the ‘probability amplitudes’ for
transitions from a certain state $m$ to another state $n$ through some
intermediate state $k$. The matrix elements $F_{mk}$ and $G_{kn}$ were interpreted
there as the ‘probability amplitudes’ for the simple transitions
$m \to k$ and $k \to n$ under the influence of perturbing forces characterized
by F and G respectively, and the matrix element $(FG)_{mn}$ as the
probability amplitude of a transition which is a combination of the 
preceding two with the intermediate state k remaining unspecified. 

We shall return to this interpretation in a later section. 

Equations (76) or (76 b) express, from a purely formal point of view, 
the multiplication law of matrices. This matrix multiplication law (i.e.
combination of the rows of the first matrix with the columns of the second)
is quite similar to the multiplication law of determinants, which can be
associated with the corresponding matrices. Hence the matrix of the
operator FG is called the product of the matrices of F and G. 

Matrix multiplication is, in general, non-commutative, just like multi¬ 
plication (i.e. successive application) of the corresponding operators. 

It must be mentioned further that the products of two Hermitian
matrices FG and GF are in general not Hermitian, the conjugate complex
of $(FG)_{mn}$ being equal to $(GF)_{nm}$. The two products are therefore
Hermitian matrices only if they are identical, i.e. if F and G commute 
with each other. 

If, instead of the product of two operators, we consider their sum
$F+G$, which is obviously commutative in the sense that

$$(F+G)\psi = (G+F)\psi,$$

and form the matrix of this sum, we obtain the relation

$$(F+G)_{mn} = F_{mn}+G_{mn} = (G+F)_{mn}, \tag{76 c}$$

which expresses the addition law of matrices, this matrix addition satisfying
the commutative law.

It can easily be shown that, for three or more factors, the associative 
law is satisfied both for operators and for the corresponding matrices, 
just as for ordinary numbers, so that, for example, 

$$(EF)G = E(FG),$$

and therefore

$$[(EF)G]_{mn} = \sum_k (EF)_{mk}G_{kn} = \sum_k\sum_l E_{ml}F_{lk}G_{kn}
= \sum_l E_{ml}(FG)_{ln} = [E(FG)]_{mn}.$$

We thus see that there exists a one-to-one correspondence between different
operators and the associated matrices, both with respect to addition and
multiplication. This correspondence enables us to replace the operator 
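representation of physical quantities, which we introduced in the preceding
chapter, by a matrix representation, each physical quantity,
whether numerically expressible, i.e. having a definite value, or not,
being represented by an array of matrix elements

$$\begin{Vmatrix}
F_{11} & F_{12} & F_{13} & \cdots \\
F_{21} & F_{22} & F_{23} & \cdots \\
F_{31} & F_{32} & F_{33} & \cdots \\
\cdots & \cdots & \cdots & \cdots
\end{Vmatrix} \tag{77}$$

or

$$\begin{Vmatrix}
F^0_{11} & F^0_{12} & F^0_{13} & \cdots \\
F^0_{21} & F^0_{22} & F^0_{23} & \cdots \\
F^0_{31} & F^0_{32} & F^0_{33} & \cdots \\
\cdots & \cdots & \cdots & \cdots
\end{Vmatrix} \tag{77 a}$$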


These will be denoted in future by single letters F and F° respectively, 
and will be used in exactly the same way as the operator representing 
the physical quantity in question, without direct reference to charac¬ 
teristic functions of any kind. 

It should, however, be kept in mind that such functions are indirectly 
implied in the very definition of the matrices $F$ or $F^0$, being the characteristic
functions of the energy operator $H$. Referred to these particular
functions, the energy is represented by a diagonal matrix

$$H = \begin{Vmatrix}
W_1 & 0 & 0 & \cdots \\
0 & W_2 & 0 & \cdots \\
0 & 0 & W_3 & \cdots \\
\cdots & \cdots & \cdots & \cdots
\end{Vmatrix},$$

i.e.

$$H_{nm} = W_n\,\delta_{nm}, \tag{77 b}$$

where

$$\delta = \begin{Vmatrix}
1 & 0 & 0 & \cdots \\
0 & 1 & 0 & \cdots \\
0 & 0 & 1 & \cdots \\
\cdots & \cdots & \cdots & \cdots
\end{Vmatrix}$$

is the so-called ‘unit-matrix’, which in future will sometimes be denoted
by 1 ($\delta_{nm} = 1_{nm}$).

The matrix elements of (77 b), i.e. the energy-levels $W_n$, appear in
the relations

$$F_{mn} = F^0_{mn}\,e^{i2\pi(W_m - W_n)t/h} \tag{77 c}$$

between the elements of (77) and (77 a), the latter being simple
numbers. The absolute values of the energy cannot, however, be derived
from these relations, which contain their differences only.

To distinguish the quantities $F_{mn}$ and $F^0_{mn}$, we shall call the $F_{mn}$ the
matrix components and the $F^0_{mn}$ the matrix elements of the quantity F.
For the energy as well as for any other constant of the motion, the
matrix components coincide with the corresponding elements, so that
we can then put $F = F^0$.


The representation of physical quantities by means of operators
(including functions of the coordinates alone) differs from the representation
by means of matrices in that the representation by operators
is absolute, while the representation by matrices is relative. By relative
we mean that the matrix elements of a quantity are defined with
respect to a particular set of stationary states which are specified by
the characteristic functions of a particular operator, or a system of
commutable operators (like $H$, $M_z$, and $M^2$). We shall see later that
this distinction is not so fundamental as it seems. The operator representation
given above is based upon the use of the coordinates (and
the time) as the directly observable quantities. But this is not necessary.
Certain other quantities, e.g. the momentum components, can
assume the role of directly observable quantities. The coordinates then
become represented as operators in terms of these new quantities.
Leaving this aside, and retaining the variables $x$, $y$, $z$, $t$ as the primary
and directly observed quantities, we can maintain the above distinction
as a fundamental one.

Now it can easily be shown that the determination of the matrix 
elements of any operator F with respect to the characteristic functions 
of some other operator H (or of a system of three commutable operators) 
does not necessarily require an actual knowledge of these functions. It 
is in fact sufficient to know that they are such as to make the matrix 
of H diagonal. If, moreover, both H and F are explicitly defined as 
functions of the coordinates $x$, $y$, $z$ and of the elementary operators
$p_x$, $p_y$, $p_z$, then, taking into account the commutation relations

$$p_x y - y p_x = 0, \text{ etc.}, \tag{78 a}$$

$$xy - yx = 0, \qquad p_x p_y - p_y p_x = 0, \text{ etc.}, \tag{78 b}$$

(in the matrix representation) we can calculate, with the help of the
matrix addition and multiplication laws together with the condition that
$x$, $y$, $z$, $p_x$, $p_y$, $p_z$ shall all be Hermitian matrices, the matrix elements
both of $H$ and of any other non-diagonal matrix $F$. After the matrix
elements of $H$ and $F$ have been determined, we can then calculate the
matrix components of $F$ (those of $H$ coinciding with the elements).

So far, therefore, as the determination of the matrix elements or 
components of any physical quantity with respect to the stationary 
states defined by some energy operator H is concerned, we can replace 
the solution of Schrödinger’s equation $H\psi^0 = W\psi^0$ and the subsequent
integration $F^0_{mn} = \int \psi^{0*}_m F\psi^0_n\,dV$ by the following problem:

(1) To determine the matrix elements of the quantities $x$, $y$, $z$,
$p_x$, $p_y$, $p_z$ subject to the commutation conditions (78), (78 a), (78 b), in
such a way that the matrix of the function $H(x,y,z;p_x,p_y,p_z)$ shall be
diagonal, i.e. that $H_{nm} = 0$ unless $n = m$.

(2) Knowing the matrices $x$, $y$, $z$, $p_x$, $p_y$, $p_z$, to calculate the matrix
elements (or components if the $H$-matrix is added to the list) of any given
function $F(x,y,z;p_x,p_y,p_z)$.

In this way the functions $\psi^0$, specifying the stationary states to
which the matrix elements refer, can be completely eliminated from 
the matrix theory, and the latter built up as a closed and consistent 
theory, in the air, as it were, by the logical attraction of its elements, 
and not requiring the use of any ideas extraneous to it for its support. 

It should be noticed that the two parts of the above problem are, 
in a certain sense, reciprocal to one another—for in the first part 
we are concerned with the solution of a system of matrix equations 
for the unknown matrices $x$, $y$, $z$, $p_x$, $p_y$, $p_z$, and in the second with
the calculation of an explicitly given function of these fundamental 
matrices. 

In problems with one degree of freedom (corresponding to the motion
of a particle in one dimension, such as the linear oscillator) the condition
‘$H$ is a diagonal matrix’, together with the commutation conditions
(78), etc., provides the basis for a complete and physically
unambiguous determination of the fundamental matrices, e.g. $x$ and
$p_x$, and consequently of the matrices representing, ‘from the point of
view of $H$’ as it were, any other quantity $F(x, p_x)$. It should be noticed,
however, that there remains a certain ambiguity which is irrelevant
for the physical interpretation of the matrix elements, but which, as
we shall see later on, is very important for the correct understanding
of the relation between matrix theory and classical mechanics. If,
in fact, $x^0_{mn}$ and $(p_x)^0_{mn}$ are matrix elements which satisfy the conditions
of the problem (or rather of its first part), then any elements of
the form

$$x^0_{mn}\,e^{i(\alpha_m-\alpha_n)}, \qquad (p_x)^0_{mn}\,e^{i(\alpha_m-\alpha_n)},$$

where $\alpha_n$ are arbitrary real numbers, will also satisfy these conditions,
the elements $F^0_{mn}$ of any other matrix being replaced accordingly
by $F^0_{mn}e^{i(\alpha_m-\alpha_n)}$. This result can easily be proved directly, or
deduced from the original definition of the matrix elements in terms of
the characteristic functions if we use the fact that each of them can
be replaced by its product by $e^{-i\alpha_n}$ without any violation of the orthogonality
and normalizing relations. This amounts to the introduction
of an arbitrary ‘phase’ into $\psi_n$ (putting $\psi_n = \psi^0_n e^{-i(2\pi\nu_n t+\alpha_n)}$) or ‘phase
difference’ into $F_{mn}$ (putting $F_{mn} = F^0_{mn}e^{i(2\pi\nu_{mn}t+\alpha_m-\alpha_n)}$).

The ‘phase’ constants $\alpha$ vanish in the diagonal elements $F^0_{nn}$ which,
as we know, determine the average or probable value of the quantity
represented by $F$ in a stationary state with the energy $W_n$. The phase
constants also vanish in the products $F^0_{mn}F^{0*}_{mn}$, i.e. in the squares of
the moduli of the matrix elements referring to different stationary
states ($W_n \neq W_m$). These products determine the probability of a
transition between the two states under the influence of a perturbation
proportional to $F$.
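The harmlessness of the phase constants can be illustrated numerically. In the sketch below the arrays F0 and G0 and the phases alpha_n are arbitrary made-up values (assumptions for illustration only). Every element is multiplied by exp(i(alpha_m - alpha_n)); the diagonal elements and the squared moduli are then unchanged, and the multiplication law is preserved, since the phases of the intermediate state cancel in every term of the sum over k.

```python
import cmath

def matmul(a, b):
    """Row-by-column product: (ab)_mn = sum over k of a_mk * b_kn."""
    size = len(a)
    return [[sum(a[m][k] * b[k][n] for k in range(size)) for n in range(size)]
            for m in range(size)]

def rephase(elements, alphas):
    """Replace each element F0_mn by F0_mn * exp(i (alpha_m - alpha_n))."""
    size = len(alphas)
    return [[elements[m][n] * cmath.exp(1j * (alphas[m] - alphas[n]))
             for n in range(size)] for m in range(size)]

F0 = [[1, 2 - 1j, 0.5], [2 + 1j, 0, 1j], [0.5, -1j, 3]]   # hypothetical elements
G0 = [[0, 1, 1j], [1, 2, 0.25], [-1j, 0.25, 1]]           # hypothetical elements
alphas = [0.4, -1.3, 2.0]                                 # arbitrary real phases

F1, G1 = rephase(F0, alphas), rephase(G0, alphas)
pairs = [(m, n) for m in range(3) for n in range(3)]

diag_shift = max(abs(F1[n][n] - F0[n][n]) for n in range(3))
modulus_shift = max(abs(abs(F1[m][n]) ** 2 - abs(F0[m][n]) ** 2) for m, n in pairs)
product_new = matmul(F1, G1)
product_old = rephase(matmul(F0, G0), alphas)
product_shift = max(abs(product_new[m][n] - product_old[m][n]) for m, n in pairs)
print(diag_shift, modulus_shift, product_shift)   # all zero up to rounding
```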

In the general case of motion in three dimensions, the condition that 
the energy matrix should be diagonal (together with the commutation 
relations (78), etc.) is not always sufficient for a physically unambiguous 
determination of the matrices $x$, $y$, $z$, $p_x$, $p_y$, $p_z$, and it has then to be
supplemented by a similar condition for one or two other matrices
representing quantities which are constants of the motion, for instance,
the $z$-component and the square of the angular momentum for motion
in a central field of force. Such additional conditions are necessary in
the case of degeneracy, the existence of which is revealed in the matrix
theory by the identity of several (diagonal) elements of the energy
matrix. The matrices representing constants of the motion must of
course, irrespective of the presence or absence of degeneracy, commute
with the energy matrix, i.e. satisfy the relation

$$(HF)_{mn} = (FH)_{mn},$$

which corresponds to the operator relation $HF = FH$. The multiplication
law (76), together with the condition that $H$ is a diagonal matrix
($H_{mn} = W_n\delta_{mn}$), gives

$$(HF)_{mn} = \sum_k H_{mk}F_{kn} = W_m F_{mn},$$

$$(FH)_{mn} = \sum_k F_{mk}H_{kn} = W_n F_{mn}.$$

The condition that $F$ is a constant of the motion therefore reduces to

$$(W_m - W_n)F_{mn} = 0,$$

which means that $F_{mn} = 0$ if $W_m \neq W_n$,

i.e. that the matrix elements of $F$ vanish for all states except those
which correspond to the same value of the energy. Therefore, if there 
is no degeneracy, the constants of the motion must be represented by 
diagonal matrices. If there is degeneracy they may but need not 
necessarily have a diagonal form. 
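The condition (W_m - W_n) F_mn = 0 can be made concrete with a small degenerate example. The sketch below assumes a made-up energy matrix with levels 1, 1, 2 (the first two degenerate) and two made-up Hermitian matrices; it is an illustration under these assumptions only. The first matrix has a non-vanishing element between the two degenerate states and still commutes with H; the second connects states of different energy and does not.

```python
def matmul(a, b):
    """Row-by-column product: (ab)_mn = sum over k of a_mk * b_kn."""
    size = len(a)
    return [[sum(a[m][k] * b[k][n] for k in range(size)) for n in range(size)]
            for m in range(size)]

def commutator(a, b):
    """Matrix of ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    size = len(a)
    return [[ab[m][n] - ba[m][n] for n in range(size)] for m in range(size)]

W = [1.0, 1.0, 2.0]   # hypothetical levels; states 0 and 1 are degenerate
H = [[W[m] if m == n else 0.0 for n in range(3)] for m in range(3)]

# Non-diagonal only inside the degenerate pair: a constant of the motion.
F_const = [[2.0, 1j, 0.0], [-1j, 3.0, 0.0], [0.0, 0.0, 5.0]]
# Connects states of different energy: not a constant of the motion.
F_other = [[2.0, 0.0, 1.0], [0.0, 3.0, 0.0], [1.0, 0.0, 5.0]]

c_const = commutator(H, F_const)
c_other = commutator(H, F_other)

# Element-wise check of (HF - FH)_mn = (W_m - W_n) F_mn.
rule_holds = all(c_other[m][n] == (W[m] - W[n]) * F_other[m][n]
                 for m in range(3) for n in range(3))
print(c_const)      # all entries zero
print(c_other)      # non-zero entries at positions (0, 2) and (2, 0)
print(rule_holds)
```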

The preceding result has already been obtained in a somewhat
different manner [cf. (77 d)]. It should be remarked that a function
$f(F)$ of a diagonal matrix is itself a diagonal matrix, the elements of
which are equal to the same function of the corresponding elements
of the argument matrix

$$[f(F)]_{nn} = f(F_{nn}).$$

This follows from the fact that the characteristic values of an operator 
f(F) must be equal to the same function of the characteristic values 
of F. This result has already been stated when discussing the energy 
operator (§7). It can be obtained directly from the matrix multiplica¬ 
tion law which gives, when F is a diagonal matrix, 

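$$(F^2)_{mn} = \sum_k F_{mk}F_{kn} = F_{mm}F_{mn} = (F_{nn})^2\,\delta_{mn},$$

$$(F^3)_{mn} = \sum_k (F^2)_{mk}F_{kn} = (F_{nn})^3\,\delta_{mn}, \text{ etc.},$$

so that, if $f(F)$ can be expanded in the form $\sum_k a_k F^k$ where $a_k$ are
numerical coefficients, we have $\bigl(\sum_k a_k F^k\bigr)_{mn} = \sum_k a_k (F_{nn})^k\,\delta_{mn}$.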
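As a small check of the rule [f(F)]_nn = f(F_nn), the sketch below evaluates a polynomial of a diagonal matrix by repeated matrix multiplication and compares the diagonal of the result with the same polynomial applied to each diagonal element. The polynomial f(x) = 1 + 3x + 2x^3 and the diagonal entries are arbitrary choices made for illustration.

```python
def matmul(a, b):
    """Row-by-column product: (ab)_mn = sum over k of a_mk * b_kn."""
    size = len(a)
    return [[sum(a[m][k] * b[k][n] for k in range(size)) for n in range(size)]
            for m in range(size)]

def matrix_polynomial(coeffs, f):
    """Return sum over k of coeffs[k] * f**k, using matrix powers of f."""
    size = len(f)
    result = [[0.0] * size for _ in range(size)]
    power = [[1.0 if m == n else 0.0 for n in range(size)] for m in range(size)]
    for a_k in coeffs:
        result = [[result[m][n] + a_k * power[m][n] for n in range(size)]
                  for m in range(size)]
        power = matmul(power, f)
    return result

coeffs = [1.0, 3.0, 0.0, 2.0]          # f(x) = 1 + 3x + 2x^3 (arbitrary example)
diagonal = [2.0, -1.0, 0.5]            # arbitrary diagonal elements F_nn
F = [[diagonal[m] if m == n else 0.0 for n in range(3)] for m in range(3)]

f_of_F = matrix_polynomial(coeffs, F)
expected = [sum(a_k * d ** k for k, a_k in enumerate(coeffs)) for d in diagonal]
print([f_of_F[n][n] for n in range(3)])   # [23.0, -4.0, 2.75]
print(expected)                           # [23.0, -4.0, 2.75]
```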

As has been pointed out at the beginning of this section, matrices
representing real physical quantities must satisfy the Hermitian condition.
The products $FG$ and $GF$ of two such matrices $F$ and $G$ (unless they
commute with each other) cannot therefore represent a real
physical quantity. Representation of real physical quantities can be
obtained, however, by taking the sum of the two products, or their
difference multiplied by $i$. In the first case we get, on dividing by 2,
the ‘symmetrized’ representation $\tfrac12(FG+GF)$ of the classical product
of the corresponding quantities. In the second case we get, with the
additional factor $2\pi/h$, the bracket expression $[F,G]$ which has been
already considered in § 8 and which corresponds to the Poisson-bracket
expression of the classical theory.
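A two-by-two example makes these statements easy to verify. The matrices F and G below are arbitrary Hermitian matrices chosen for illustration (an assumption, not taken from the text). The sketch checks that the conjugate transpose of FG equals GF, that FG itself is not Hermitian, and that both the symmetrized product and i(FG - GF) are Hermitian.

```python
def matmul(a, b):
    """Row-by-column product: (ab)_mn = sum over k of a_mk * b_kn."""
    size = len(a)
    return [[sum(a[m][k] * b[k][n] for k in range(size)) for n in range(size)]
            for m in range(size)]

def dagger(a):
    """Conjugate transpose: element (m, n) is the conjugate of a_nm."""
    size = len(a)
    return [[a[n][m].conjugate() for n in range(size)] for m in range(size)]

def is_hermitian(a):
    return a == dagger(a)

F = [[1, 1j], [-1j, 2]]     # hypothetical Hermitian matrix
G = [[0, 1], [1, 0]]        # hypothetical Hermitian matrix

FG, GF = matmul(F, G), matmul(G, F)
symmetrized = [[(FG[m][n] + GF[m][n]) / 2 for n in range(2)] for m in range(2)]
i_times_difference = [[1j * (FG[m][n] - GF[m][n]) for n in range(2)]
                      for m in range(2)]

print(dagger(FG) == GF)                  # True
print(is_hermitian(FG))                  # False: F and G do not commute
print(is_hermitian(symmetrized))         # True
print(is_hermitian(i_times_difference))  # True
```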

12. The Correspondence between Matrix and Classical Mechanics

The matrix representation of physical quantities was introduced by 
W. Heisenberg towards the end of 1925. A few months later Schrödinger’s
wave-mechanical theory appeared, but nevertheless Heisenberg,
Born, and Jordan continued, for some time during 1926, to
develop their ‘matrix theory’, without seeing any connexion between
it and the ‘wave theory’. The connexion was finally discovered by
Schrödinger (and independently by Pauli) who found that the
Heisenberg-Born-Jordan matrix elements could be calculated from the wave
functions by means of the formula $F^0_{mn} = \int \psi^{0*}_m F\psi^0_n\,dV$. This little bit
of history serves to illustrate the fact that the matrix theory does not 
need a wave-mechanical support, but can be made completely ‘self- 
supporting’. We shall see later that the connexion between the wave 
theory and the matrix theory can actually be reversed in the sense that 
the matrix theory, in a generalized form due to Dirac and Jordan, 
contains the wave-mechanical theory as a particular case (§ 14). 

In his formulation of the matrix theory, Heisenberg was guided by 
Bohr’s ideas concerning the correspondence between the quantum and 
the classical description of the phenomena of radiation. In ‘the good 
old days’ before the coming of the quantum theory, atomic phenomena, 
and in particular those connected with the emission or absorption of 
radiation, were described in terms of a steady motion of the electrons. 
To this idea of steady (or continuous) motion, Bohr added the idea of 
transitions from one state of motion to another. In this way, between 
the years 1913 and 1925, physicists gradually became accustomed to 
considering two types of mechanical quantities—classical and quantum- 
mechanical. On the one hand we had, for example, the classical 
frequencies or amplitudes referring to the steady motion (analysed by 
means of a Fourier series into a sum of harmonic vibrations), while 
on the other hand we had the quantum frequencies or amplitudes 
referring to the transitions. 

By means of his ‘correspondence principle’, Bohr was able, in 1918, 
to establish an approximate relationship between the classical and the 
quantum-mechanical quantities. Advancing still further along the path
laid down by Bohr, Heisenberg rejected the classical quantities altogether,
as devoid of physical meaning, and devised the matrix scheme
(improved a little later by Born and Jordan) for the direct calculation
of the quantum-mechanical quantities. 

The correspondence principle can be explained in the simplest way 
for a one-dimensional motion, restricted classically to a finite region, 
e.g. lying between $x'$ and $x''$, and therefore periodic. The coordinate $x$
of the particle can then be described classically as a periodic function
of the time and expanded in a Fourier series of the form

$$x(t) = \sum_{k=-\infty}^{+\infty} x^0(k)\,e^{i2\pi k\nu t}, \tag{79}$$

where $\nu = 1/\tau$ is the fundamental frequency of oscillation ($\tau$ is the
period of oscillation, i.e. the duration of the ‘round trip’ from $x'$ to $x''$
and back again to $x'$), and $x^0(k)$ is the amplitude of the $k$th harmonic
term having a frequency $k\nu$. The two complex terms with the frequencies
$+k\nu$ and $-k\nu$ must, of course, combine to form a real term
of the type $a_{|k|}\cos 2\pi|k|\nu t + b_{|k|}\sin 2\pi|k|\nu t$;
it follows that the amplitudes $x^0(+k)$ and $x^0(-k)$ must be conjugate
complex quantities

$$x^0(-k) = x^0(+k)^*, \tag{79 a}$$

giving $a_{|k|} = x^0(k)+x^0(k)^*$, $b_{|k|} = i[x^0(k)-x^0(k)^*]$.
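The passage from the complex series (79) to the real cosine and sine form can be checked numerically. In the sketch below the fundamental frequency and the amplitudes x0(k) for k = 0, 1, 2 are arbitrary made-up values; the amplitudes for negative k are filled in from the conjugation rule (79 a). The complex sum then has a vanishing imaginary part and agrees with the real form built from the coefficients a_k and b_k given above.

```python
import cmath
import math

nu = 1.5                                              # hypothetical fundamental frequency
amplitudes = {0: 0.2, 1: 0.5 - 0.3j, 2: -0.1 + 0.25j}  # hypothetical x0(k), k >= 0
for k in (1, 2):
    amplitudes[-k] = amplitudes[k].conjugate()        # rule (79 a)

def x_complex(t):
    """Complex Fourier series (79), truncated at |k| = 2."""
    return sum(x0 * cmath.exp(2j * math.pi * k * nu * t)
               for k, x0 in amplitudes.items())

def x_real(t):
    """Same series written with a_k cos + b_k sin."""
    total = amplitudes[0]
    for k in (1, 2):
        x0 = amplitudes[k]
        a_k = (x0 + x0.conjugate()).real              # a_k = x0(k) + x0(k)*
        b_k = (1j * (x0 - x0.conjugate())).real       # b_k = i [x0(k) - x0(k)*]
        total += (a_k * math.cos(2 * math.pi * k * nu * t)
                  + b_k * math.sin(2 * math.pi * k * nu * t))
    return total

times = [0.0, 0.13, 0.4, 0.77]
max_imag = max(abs(x_complex(t).imag) for t in times)
max_diff = max(abs(x_complex(t).real - x_real(t)) for t in times)
print(max_imag, max_diff)   # both zero up to rounding
```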

Bohr’s theory, in so far as it was concerned with steady motions, 
restricted these motions by quantum conditions which, in the present 
case, reduce to the single equation

$$J = \oint g\,dx = nh, \tag{80}$$

specifying the quantized values of the energy $W = W_n$ and hence determining
the fundamental frequencies $\nu = \nu_n$. Putting $g = \sqrt{2m(W-U)}$,
and differentiating the integral

$$J = \oint \sqrt{2m(W-U)}\,dx$$

with respect to $W$ (considered as a parameter), we get

$$\frac{dJ}{dW} = \oint \frac{m\,dx}{g} = \oint \frac{dx}{\sqrt{2(W-U)/m}},$$

i.e.

$$\frac{dJ}{dW} = \oint dt = \tau,$$

or

$$\frac{dW}{dJ} = \nu. \tag{80 a}$$
This relation is a special case of the general relations between the
energy, the fundamental frequencies $\nu_1$, $\nu_2$, $\nu_3$, and the fundamental
moduli of periodicity $J_1$, $J_2$, $J_3$ of the action function $S$ which were
deduced, in an earlier chapter, for motion in three dimensions, with
the help of the theory of canonical transformations (Chap. I, § 5).

Although the ‘classical’ frequency $\nu$ given by (80 a) refers to a steady
motion, nevertheless it is expressed as the ratio of the differences of
$W$ and $J$ for two different, though closely neighbouring, motions as if
it were associated with a transition between them. In fact the relation
(80 a) bears a striking resemblance to Bohr’s frequency condition

$$\nu_{mn} = \frac{W_m - W_n}{h},$$

which gives the quantum frequency associated with a transition between
two more or less widely different ‘quantized’ states $m$ and $n$. Introducing
the quantized values of the integral $J$, we can rewrite the
preceding equation in the form

$$\nu_{mn} = (m-n)\,\frac{W_m - W_n}{J_m - J_n} = (m-n)\,\frac{\Delta W}{\Delta J}. \tag{80 b}$$

If $W$ varies slowly with $J$, and if the quantum jump $m-n$ is not too
large compared with $m$ or $n$, then the difference ratio $\Delta W/\Delta J$ can be
replaced approximately by the differential coefficient $dW/dJ$. From
(80 a) we then get the following approximate relation between the
classical and the quantum frequencies:

$$\nu_{mn} \cong (m-n)\,\nu. \tag{80 c}$$

We may regard this relation as indicating an approximate coincidence
or a ‘correspondence’ between the quantum frequency associated with
a $k$-fold jump and the classical frequency of the harmonic oscillation
of the order $k$ ($k = m-n$).
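The approximate relation (80 c) can be illustrated on a simple system that is not treated in the text: a particle moving freely between two walls a distance L apart (an assumption chosen only because the quantum condition (80) is easy to apply to it). There J = 2Lg = nh, so that W_n = n^2 h^2 / (8 m L^2), and the classical frequency dW/dJ of the n-th orbit is n h / (4 m L^2). The sketch compares the quantum frequency of a jump with the corresponding classical harmonic, in units with h = m = L = 1.

```python
def energy(n, h=1.0, mass=1.0, length=1.0):
    """Bohr-quantized energy of a particle between two walls: J = 2 L g = n h."""
    return (n * h) ** 2 / (8 * mass * length ** 2)

def classical_frequency(n, h=1.0, mass=1.0, length=1.0):
    """nu = dW/dJ evaluated at J = n h, cf. (80 a)."""
    return n * h / (4 * mass * length ** 2)

def quantum_frequency(m, n, h=1.0):
    """Bohr's frequency condition nu_mn = (W_m - W_n) / h."""
    return (energy(m, h) - energy(n, h)) / h

def ratio(m, n):
    """Quantum frequency divided by the (m - n)-th classical harmonic, cf. (80 c)."""
    return quantum_frequency(m, n) / ((m - n) * classical_frequency(n))

for m, n in [(2, 1), (11, 10), (101, 100), (103, 100)]:
    print(m, n, ratio(m, n))
```

For this system the ratio equals (m + n) / (2n) exactly, so it is 1.5 for the jump 2 → 1 but 1.005 for 101 → 100: the correspondence improves as the quantum numbers grow and as the jump becomes small compared with them.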

This correspondence between the classical and the quantum fre¬ 
quencies forms the nucleus of Bohr’s correspondence principle. The 
principle is extended by asserting that, in addition to this correspon¬ 
dence between the frequencies, there is also a correspondence between 
the amplitudes. 

Let us denote the functions $x(t)$ for the $n$th stationary state by $x_n(t)$
and the expansion coefficients $x^0(k)$ by $x^0_n(k)$. Formula (79) then
becomes

$$x_n(t) = \sum_{k=-\infty}^{+\infty} x^0_n(k)\,e^{i2\pi k\nu t}. \tag{81}$$

Writing $m-n$ instead of $k$ and putting

$$x^0_n(m-n) = x^0_{mn}, \tag{81 a}$$

formula (81) becomes

$$x_n(t) = \sum_{m=-\infty}^{+\infty} x^0_{mn}\,e^{i2\pi(m-n)\nu t}. \tag{81 b}$$


Now if the classical frequency $(m-n)\nu$ corresponds to the quantum
frequency $\nu_{mn}$ of the light emitted by the system under consideration
(linear oscillator) as a result of the transition $m \to n$ (if $W_m > W_n$),
then the classical amplitude $x^0_{mn}$ associated with this frequency must,
according to Bohr, correspond to the quantum amplitude of the emitted
light, the correspondence being such that the intensity of the emitted
light must coincide approximately with the intensity calculated classically
on the assumption that the motion of the particle (which is
supposed to possess an electric charge without which there would be
no radiation) is represented by the simple harmonic term

$$x_{mn} = x^0_{mn}\,e^{i2\pi(m-n)\nu t}.$$

The approximation with regard to intensity must be the closer the 
closer the approximation with regard to frequency. 

The ability of the correspondence principle to predict intensities has 
been verified in those cases where there is actually a close approxima¬ 
tion between the classical and quantum frequencies. For example, it 
was able to predict successfully the relative intensities of the neigh¬ 
bouring lines appearing in the Stark effect. Nevertheless the nature of 
the correspondence established by Bohr remained mysterious, until 
Heisenberg, towards the end of 1925, unveiled it in a way worthy of 
admiration both for its simplicity and for its boldness. Basing his theory 
upon the principle that only those things have a real existence which 
can be observed, Heisenberg put forward the idea that classical quan¬ 
tities do not exist at all, since they do not produce any directly observed 
optical effects. In fact the position and intensity of the observed 
spectrum lines can only be expressed in terms of quantum or transition 
quantities. 

From this point of view, the classical method of describing the motion 
of the particle by determining its coordinates for a given stationary 
state $n$ as a certain function of the time $x_n(t)$, which could be expanded
in a Fourier series (81 b), was to be considered as an approximation
to the description of the motion by means of a double array or matrix
of components of the form

$$x_{mn} = x^0_{mn}\,e^{i2\pi\nu_{mn}t},$$

‘corresponding’ to the totality of the classical harmonic terms for
different values of $m$ and $n$ in the same sense in which an approximation
corresponds to the truth.

At this point two different possibilities for reforming classical
mechanics seemed to be open. The one consisted in assuming that the
motion of the particle in a stationary state $n$ can be described as a
definite function of the time, namely, by the series

$$x_n(t) = \sum_{m=-\infty}^{+\infty} x^0_{mn}\,e^{i2\pi\nu_{mn}t},$$

which should replace the simple Fourier series (81 b), and that the
equations of motion should be so modified as to lead to solutions of
this new type instead of solutions of the type (81 b).

The second possibility was to assume that the classical description
of motion, establishing a definite dependence of the position of the
particle upon the time, had to be abandoned and replaced by a quantum
description in which the coordinate $x$ was to be determined as a matrix,
made up of components of the type $x^0_{mn}e^{i2\pi\nu_{mn}t}$. In this case the
external form of the classical equations of motion could be maintained
and only their physical meaning altered, the variables $x$, $p_x$, $H$, etc.,
being regarded and determined not as ordinary quantities but as
matrices.

With an unerring intuition Heisenberg chose the second way, thus 
giving up the very idea of motion in the classical sense (as being funda¬ 
mentally unobservable and therefore devoid of physical meaning) and 
laying the foundation of the new quantum or matrix mechanics. The 
idea that the quantum description of motion amounts to the deter¬ 
mination of quantities relating only to transitions between different 
states requires an important amendment, for besides such components 
a matrix contains diagonal components or elements relating to definite 
states taken separately. As we know, these diagonal elements are equal 
to the average or probable values of the quantity represented by the 
matrix for the corresponding states. This result, which has already 
been discussed in Chap. I, § 5, follows also from the preceding considerations
connected with the correspondence principle. The time-average
value of some quantity, e.g. $x$, as represented by a Fourier series (81), is
obviously equal to that term of this series which does not depend upon
the time, for which therefore $k = 0$. We thus have

$$\overline{x_n(t)} = x^0_n(0),$$

or, using the notation (81 a),

$$\overline{x_n(t)} = x^0_{nn}.$$

Having defined every physical quantity as a matrix, Heisenberg
naturally enough replaced the usual multiplication law for ordinary
numbers by the matrix multiplication law. In this he was guided by
the necessity of securing the form

$$F_{mn} = F^0_{mn}\,e^{i2\pi\nu_{mn}t}$$

with the same transition frequencies $\nu_{mn}$ for the matrix representing
any function $F(x)$ as those which appear in the matrix (82 a) for the
coordinate $x$. Taking, for instance, $F(x) = x^2$ and using the matrix
multiplication law, we get

$$(x^2)_{mn} = \sum_k x_{mk}x_{kn} = \Bigl(\sum_k x^0_{mk}x^0_{kn}\Bigr)e^{i2\pi\nu_{mn}t} = (x^2)^0_{mn}\,e^{i2\pi\nu_{mn}t}$$

as a consequence of the relations $\nu_{mk} = (W_m - W_k)/h$, $\nu_{kn} = (W_k - W_n)/h$,
$\nu_{mn} = (W_m - W_n)/h = \nu_{mk}+\nu_{kn}$; cf. (76) and (76 b).

Having introduced matrices to represent physical quantities and the
matrix multiplication law for the calculation of matrices representing
functions of such quantities, Heisenberg kept unaltered the form of the
equation of the motion

$$m\frac{d^2x}{dt^2} = f(x),$$

understanding by $x$ and $f(x)$ not the usual variables but the corresponding
matrices, and put Bohr’s quantum condition $\oint g\,dx = nh$ in
the form

$$(gx - xg)_{nn} = \frac{h}{2\pi i},$$

leaving the question of the non-diagonal elements of the matrix open.
The commutation condition

$$gx - xg = \frac{h}{2\pi i},$$

which also fixes the non-diagonal elements of this matrix (as equal to
zero) was established by way of a generalization somewhat later by
Born and Jordan, and still later was recognized (by Schrödinger and
Eckart) as giving the key for the transition from matrix mechanics to
wave mechanics, this transition consisting essentially in considering $x$
as an ordinary variable and $g$ as the operator $\dfrac{h}{2\pi i}\dfrac{\partial}{\partial x}$, and further in
replacing matrix equations by operator equations with the wave function
to be operated upon.
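The commutation condition can be tried out on the linear oscillator. The sketch below uses the standard oscillator matrices for x and g, which are quoted here as an assumption rather than derived in the text, in units where h/(2π), the mass and the angular frequency are all 1, so that h/(2πi) becomes -i. The true matrices are infinite; the sketch keeps only the first N rows and columns, which spoils the last diagonal entry, so only the first N - 1 entries are examined.

```python
import math

N = 6   # size of the truncated matrices (the true matrices are infinite)

def matmul(a, b):
    """Row-by-column product: (ab)_mn = sum over k of a_mk * b_kn."""
    size = len(a)
    return [[sum(a[m][k] * b[k][n] for k in range(size)) for n in range(size)]
            for m in range(size)]

# Assumed oscillator matrices in units h/(2 pi) = mass = angular frequency = 1.
x = [[0j] * N for _ in range(N)]
g = [[0j] * N for _ in range(N)]
for n in range(N - 1):
    amp = math.sqrt((n + 1) / 2)
    x[n][n + 1] = x[n + 1][n] = amp      # x is real and symmetric
    g[n + 1][n] = 1j * amp               # g is Hermitian, purely imaginary
    g[n][n + 1] = -1j * amp

gx, xg = matmul(g, x), matmul(x, g)
commutator = [[gx[m][n] - xg[m][n] for n in range(N)] for m in range(N)]
gg, xx = matmul(g, g), matmul(x, x)
energy = [[(gg[m][n] + xx[m][n]) / 2 for n in range(N)] for m in range(N)]

# (gx - xg)_nn = h/(2 pi i) = -1j, except for the last entry (truncation artifact).
diagonal_ok = all(abs(commutator[n][n] + 1j) < 1e-12 for n in range(N - 1))
off_diagonal = max(abs(commutator[m][n])
                   for m in range(N) for n in range(N) if m != n)
energy_off_diagonal = max(abs(energy[m][n])
                          for m in range(N) for n in range(N) if m != n)
levels = [energy[n][n].real for n in range(N - 1)]
print(diagonal_ok, off_diagonal, energy_off_diagonal)
print(levels)   # close to 0.5, 1.5, 2.5, 3.5, 4.5
```

With these matrices the energy (g^2 + x^2)/2 comes out diagonal, which is the first part of the matrix problem stated in the preceding section, and the non-diagonal elements of gx - xg vanish as required by the Born-Jordan generalization.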

The information obtained from the wave-mechanical treatment of 
a problem is more complete than that obtained from the matrix- 
mechanical treatment, for in addition to the matrix elements we obtain, 
in the former case, the wave functions which serve to determine the
probable location of the particle, its probable velocity, and so on. In
the matrix mechanics the notion of probability with reference to
separate states appears only through the diagonal elements, representing
probable values, while the non-diagonal elements can be interpreted
under certain conditions as the probability amplitudes for transitions
between different states. In Heisenberg’s original theory, the matrix
components of the coordinate were looked for as quantities which
determine the intensity of radiation or, what amounts to the same
thing, the probability of transitions with emission of light, it being
assumed that the intensity of radiation associated with the matrix
component $x_{mn} = x^0_{mn}e^{i2\pi\nu_{mn}t}$ is the same as it would be on the classical
theory if $x_{mn}$ represented the actual motion of the particle as a harmonic
function of the time. The result of this assumption is the same as that
obtained in Part I in connexion with Schrödinger’s theory of radiation,
namely, that the probability of a spontaneous transition $m \to n$ with
emission of energy in the form of monochromatic light of the frequency
$\nu_{mn}$ is equal (per unit time) to

$$A_{mn} = \frac{64\pi^4 e^2\,\nu_{mn}^3}{3c^3h}\,|x^0_{mn}|^2,$$

where $e$ is the electrical charge of the particle [Part I, eq. (93)].

In the preceding sketch of the development of Heisenberg’s matrix 
theory from Bohr's correspondence principle we did not attempt to give 
a direct proof of the latter so far as it refers to the connexion between 
the Fourier amplitudes and the matrix elements, having confined our¬ 
selves to the frequencies with respect to which the correspondence 
could be established by means of Bohr’s own theory. This gap can be
filled with the help of wave mechanics, or rather that approximate form
of it which has been discussed in Chap. I, § 5, and which corresponds 
to the classical mechanics together with Bohr’s quantum conditions. 

We have already used this approximate form of the theory for comparing
the classical time-averages (which are equal to the constant term
in the Fourier expansion of the corresponding quantity $F$ considered
as a function of the time) with its probable values, defined by the
integrals $\int \psi_n^* F\psi_n\,dx$, which are nothing else but the diagonal elements
$F_{nn} = F^0_{nn}$ of the matrix representing $F$. We have found that to the
approximation implied by the formula (23 a), § 4,

$$\psi_n \cong \frac{c_n}{\sqrt{v_n}}\,e^{i2\pi s_n/h}, \tag{82}$$

where $v_n$ is the velocity of the particle (defined by the equation
$v_n = \sqrt{2(W_n - U)/m}$ as a function of its position $x$) and

$$s_n(x,t) = s^0_n(x) - W_n t$$

the classical action function for the state in question (with the energy
$W_n$), the classical time-average $\frac{1}{\tau}\int_0^{\tau} F(t)\,dt$ coincides with the probable
value $\int_{x'}^{x''} F\psi_n^*\psi_n\,dx$ provided $\psi_n$ is normalized to unity, that is, the
coefficients $c_n$ are set equal to $\sqrt{2/\tau}$:

$$\int_{x'}^{x''} \psi_n^*\psi_n\,dx = |c_n|^2\int_{x'}^{x''}\frac{dx}{v_n} = |c_n|^2\,\tfrac12\tau = 1.$$

In a similar way it is possible to ascertain the approximate equality
between the Fourier coefficients in the expansion of x(t), or any function
of x supposed to be determined as a function of t according to the
classical laws of motion, and the ‘corresponding’ matrix elements of
this function F(x).

In order to determine the Fourier coefficient $x^0(n)$ in the expansion
(79) we multiply x(t) by $e^{-i2\pi n\nu t}$ and notice that the constant term in
the resulting expansion is just $x^0(n)$. We thus get

    $$x^0(n) = \frac{1}{\tau}\int_0^{\tau} x(t)\,e^{-i2\pi n\nu t}\,dt,$$

or, in the alternative notation corresponding to (81 b),

    $$x^0_{mn} = \frac{1}{\tau}\int_0^{\tau} x(t)\,e^{-i2\pi(m-n)\nu t}\,dt.$$

The coordinate x can be replaced here, as just mentioned, by any
function of x (or of x and g) giving

    $$F^0_{mn} = \frac{1}{\tau}\int_0^{\tau} F(t)\,e^{-i2\pi(m-n)\nu t}\,dt.$$    (82 a)

On the other hand, we have by the definition of the matrix elements

    $$F^0_{mn} = \int_{x'}^{x''} \psi_m^{0*} F \psi_n^0\,dx,$$

or, according to (82), with $s(x,t) = s^0(x) - Wt$ and
$\psi^0_n = \sqrt{\dfrac{2}{\tau v_n}}\,e^{i2\pi s^0_n(x)/h}$,

    $$F^0_{mn} = \frac{2}{\tau}\int_{x'}^{x''} F(x)\,e^{i2\pi[s^0_n(x)-s^0_m(x)]/h}\,\frac{dx}{\sqrt{v_n v_m}}.$$    (82 b)

Now if the states n and m differ but little with respect to their
energy, we can replace $\sqrt{v_n v_m}$ by a certain mean value of the velocity
for an energy W lying between $W_n$ and $W_m$, and put accordingly
$dx/\sqrt{v_n v_m} = dt$ just as in the case n = m. We have further under the
same condition

    $$s^0_n(x) - s^0_m(x) = \frac{\partial s^0(x,J)}{\partial J}\,(J_n - J_m),$$


where J is the action variable (80) (introduced in Chap. II, § 5, for the
general case of a three-dimensional motion), and $J_n = nh$, $J_m = mh$
its quantized values. In the case here considered of a one-dimensional
motion the function $s^0(x)$ can be readily determined, from the equation
$g = ds^0(x)/dx$ defining it, by the formula

    $$s^0(x) = \int g\,dx = \int \sqrt{2m(W-U)}\,dx,$$

whence it follows [cf. the derivation of (80 a)] that

    $$\frac{\partial s^0(x)}{\partial W} = \int \frac{dx}{\sqrt{2(W-U)/m}} = \int \frac{m\,dx}{g} = t + \text{const.},$$

and consequently (dropping the irrelevant constant)

    $$\frac{\partial s^0}{\partial J} = \frac{\partial s^0}{\partial W}\,\frac{dW}{dJ} = t\,\frac{dW}{dJ}.$$


We thus get with the above approximation

    $$s^0_n(x) - s^0_m(x) = t\,\frac{dW}{dJ}\,(J_n - J_m),$$

or, since with the same approximation $(J_n - J_m)\,dW/dJ = W_n - W_m$,

    $$s^0_n(x) - s^0_m(x) = (W_n - W_m)\,t.$$    (82 c)

This gives, on substitution in (82 b),

    $$F^0_{mn} = \frac{2}{\tau}\int_0^{\frac{1}{2}\tau} F(t)\,e^{-i2\pi(W_m-W_n)t/h}\,dt,$$

which coincides with (82 a) when we remember that

    $$(m-n)\nu \approx (W_m - W_n)/h.$$
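The correspondence just derived can be checked numerically on the linear oscillator. The sketch below is only an illustration under stated assumptions: it works in units h = m = ν₀ = 1, takes the oscillator matrix element |x_{n,n-1}| = √(nh/8π²ν₀m) from the matrix treatment of § 13, and evaluates the classical Fourier coefficient at the mean energy nhν₀ of the two states concerned. The function names are hypothetical.

```python
import cmath
import math

# Illustrative units (an assumption of this sketch): h = m = nu0 = 1.
h, m, nu0 = 1.0, 1.0, 1.0

def fourier_coefficient(x_of_t, k, nu, steps=2000):
    """(1/tau) * integral over one period of x(t) exp(-i 2 pi k nu t) dt (midpoint rule)."""
    tau = 1.0 / nu
    dt = tau / steps
    total = 0j
    for j in range(steps):
        t = (j + 0.5) * dt
        total += x_of_t(t) * cmath.exp(-2j * math.pi * k * nu * t)
    return total * dt / tau

def classical_amplitude(W):
    """Amplitude a of x(t) = a cos(2 pi nu0 t) for the energy W = (1/2) m (2 pi nu0 a)^2."""
    return math.sqrt(2.0 * W / m) / (2.0 * math.pi * nu0)

def quantum_element(n):
    """|x_{n,n-1}| of the oscillator as obtained by the matrix method."""
    return math.sqrt(n * h / (8.0 * math.pi ** 2 * nu0 * m))

def compare(n):
    """Return (classical first Fourier coefficient, quantum matrix element)."""
    W_mean = n * h * nu0  # mean of W_n and W_{n-1} when W_n = h nu0 (n + 1/2)
    a = classical_amplitude(W_mean)
    x = lambda t: a * math.cos(2.0 * math.pi * nu0 * t)
    return abs(fourier_coefficient(x, 1, nu0)), quantum_element(n)

for n in range(1, 4):
    print(n, compare(n))
```

For the oscillator the two numbers agree to rounding error, because the classical amplitude at the mean energy reproduces the matrix element exactly; for other potentials only approximate agreement at large quantum numbers should be expected.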


The preceding results can easily be extended to the general case of
the motion of a particle with three degrees of freedom in a limited
region of space. According to classical mechanics such a motion can
be described under certain very general assumptions as a ‘conditionally
periodic’ motion, which means that the coordinates, or any function F
of the latter, can be represented as a function of the time by a triple
Fourier series

    $$F_{n_1n_2n_3}(t) = \sum_{m_1,m_2,m_3} F^0_{m_1m_2m_3;\,n_1n_2n_3}\,e^{i2\pi[(m_1-n_1)\nu_1+(m_2-n_2)\nu_2+(m_3-n_3)\nu_3]t}.$$

According to wave mechanics, a series of this kind, as a whole, will
have no (or at least no exact) significance; the totality of the harmonic
terms in all such series, corresponding to all possible states $n_1, n_2, n_3$,
will, however, constitute an approximate expression of the matrix
representing the quantity F. The exact expression of its matrix
components can be obtained if we replace the classical frequencies
$(m_1-n_1)\nu_1+(m_2-n_2)\nu_2+(m_3-n_3)\nu_3$ by the transition frequencies
$(W_{m_1m_2m_3}-W_{n_1n_2n_3})/h$ and define the amplitudes $F^0_{m_1m_2m_3;\,n_1n_2n_3}$ by the
integrals $\int \psi^*_{m_1m_2m_3} F\,\psi_{n_1n_2n_3}\,dV$. The approximate equivalence of this
definition to the classical one given above can be shown with the help
of equations (32), (32 a), and (32 b) of § 5 in exactly the same way
as before.


One might be tempted to think that it would be possible to give a
correct wave-mechanical definition of the quantity F as a function of the
time by replacing the classical amplitudes and frequencies in the
preceding expression for $F_{n_1n_2n_3}(t)$ by the quantum ones, i.e. by putting

    $$F_{n_1n_2n_3}(t) = \sum_{m_1,m_2,m_3} F^0_{m_1m_2m_3;\,n_1n_2n_3}\,e^{i2\pi(W_{m_1m_2m_3}-W_{n_1n_2n_3})t/h}.$$

The fact that no physical significance can be attached to this ‘modified’
Fourier series is, however, clearly illustrated by the possibility of
multiplying the functions $\psi_{n_1n_2n_3}$ by arbitrary phase factors
$e^{i\alpha_{n_1n_2n_3}}$, resulting in the multiplication of the matrix elements by the
phase factors $e^{i(\alpha_{n_1n_2n_3}-\alpha_{m_1m_2m_3})}$, which are completely irrelevant from the
point of view of the wave-mechanical or the matrix theory, but profoundly
influence the ‘modified’ definition of the function $F_{n_1n_2n_3}(t)$.


13. Application of the Matrix Method to Oscillatory and Rotational Motion

The matrix mechanics of Heisenberg, Born, and Jordan can be considered
as a kind of ‘skeleton’ of Schrödinger’s wave mechanics, complete
in itself but nevertheless deprived of the flesh and blood of
the probability conception, which forms the vital element of wave
mechanics. In addition, the wave-mechanical theory has another
advantage over the matrix theory, for, as a rule, it is easier to solve
Schrödinger’s equation for the characteristic functions of the energy
operator and then to use these functions to calculate the matrix
elements of any other operator by means of integration, than to determine
these matrix elements from the condition that the matrix of the energy
is diagonal, together with the commutation relations for the coordinates
and momentum components, without knowing or using the characteristic
functions at all.

The practical application of the matrix theory to concrete problems
can, however, be made much easier and more convenient if instead of
carrying out the matrix representation directly with respect to the
fundamental operator relations $p_x x - x p_x = \dfrac{h}{2\pi i}\,1$, etc., together with
the condition that $H(x,y,z;\,p_x,p_y,p_z)$ is diagonal, it is carried out with
respect to some other operator relations between certain more
complicated functions F, G, etc., the choice of which depends upon the
character of the problem [i.e. on the potential-energy function U(x, y, z)],
if at least some of these functions commute with the energy, i.e.
represent constants of the motion. If G is such a constant (it may, in
particular, coincide with the energy H), and if some other function
F (for instance, the coordinate x) has been found which satisfies a
commutation relation of the form $GF - FG = \alpha F + \beta G$ where $\alpha$ and $\beta$
are constant, the matrix interpretation leads very simply to the
determination of the matrix elements both of G, which can be assumed to
be diagonal, and of F. Applying the matrix multiplication rule to the
left side of the preceding equation, we get

    $$(GF-FG)_{mn} = (G_{mm}-G_{nn})F_{mn} = \alpha F_{mn} + \beta G_{nn}\delta_{mn},$$

whence it follows that all the matrix elements of F vanish with the
exception of the diagonal elements which are equal to

    $$F_{nn} = -\frac{\beta}{\alpha}\,G_{nn}$$

and those for which $G_{mm} - G_{nn} = \alpha$.

This equation leads very simply to the determination of the numbers
$G_{nn}$, especially when n can be treated as a simple quantum number
(and not as a set of several quantum numbers $n_1, n_2, n_3$, all of them
different from the numbers $m_1, m_2, m_3$ represented by m). By a suitable
labelling of the states associated with given values of G, we can make
those states for which the values of G differ by $\alpha$ successive, i.e. having
values of n and m differing by 1, so that the preceding equation will
reduce to $G_{n+1,n+1} - G_{nn} = \alpha$. The solution of this equation is obviously
of the form $G_{nn} = \alpha n + \gamma$, where $\gamma$ is a certain constant. We shall not
develop these general considerations but shall merely illustrate and
amplify them by means of two special problems of outstanding
simplicity and practical importance, namely, the problem of a linear
harmonic oscillator and the problem of the rotational part of the motion
of a particle in a central (radially symmetrical) field of force.
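The general statement above can be illustrated with small finite matrices. The following is a minimal sketch, assuming β = 0 and integer entries so that the arithmetic is exact; the helper names are hypothetical. It builds a diagonal G with G_nn = αn + γ and an F whose only non-vanishing elements are F_{n+1,n}, and forms the commutator GF − FG.

```python
def matmul(X, Y):
    """Product of two square matrices given as lists of rows."""
    size = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]

def commutator(X, Y):
    """XY - YX for square matrices."""
    XY, YX = matmul(X, Y), matmul(Y, X)
    size = len(X)
    return [[XY[i][j] - YX[i][j] for j in range(size)] for i in range(size)]

def ladder_example(size, alpha, gamma, f):
    """Diagonal G with G_nn = alpha*n + gamma, and F with F_{n+1,n} = f(n) only."""
    G = [[(alpha * i + gamma) if i == j else 0 for j in range(size)]
         for i in range(size)]
    F = [[f(j) if i == j + 1 else 0 for j in range(size)] for i in range(size)]
    return G, F

G, F = ladder_example(4, alpha=3, gamma=2, f=lambda n: n + 1)
print(commutator(G, F))  # equals alpha * F element by element
```

Since (GF − FG)_{mn} = (G_mm − G_nn)F_mn, any F confined to the elements with G_mm − G_nn = α satisfies GF − FG = αF, whatever the values f(n) may be.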

The energy of a linear harmonic oscillator is expressed by the operator
or matrix (as we please)

    $$H = \frac{1}{2m}\,p^2 + \tfrac{1}{2}(2\pi\nu_0)^2\,m x^2,$$    (83)

where $\nu_0$ is the natural vibration frequency of the classical theory.
According to the matrix theory H has to be ‘diagonalized’ subject to
the additional condition

    $$px - xp = \frac{h}{2\pi i}\,1,$$    (83 a)

1 being the unit matrix.

We shall put, for the sake of brevity,

    $$2\pi\nu_0 m x = q, \qquad 2mH = K, \qquad h\nu_0 m = \omega,$$

so that (83) and (83 a) can be written in the form

    $$p^2 + q^2 = K, \qquad pq - qp = -i\omega,$$    (83 b)

it being understood that $\omega$ denotes the product of the factor $h\nu_0 m$ and
the unit matrix.

We shall now introduce the matrices

    $$r = p + iq \quad\text{and}\quad s = p - iq$$    (84)

which are more convenient to deal with than p and q taken separately.
Taking their product in the order rs, we get

    $$rs = pp + iqp - ipq + qq = p^2 + q^2 - i(pq - qp),$$

i.e.    $$rs = K - \omega.$$    (84 a)

Similarly we get    $$sr = K + \omega.$$    (84 b)

Hence, using the associative law,

    $$rsr = (rs)r = (K-\omega)r,$$
    $$rsr = r(sr) = r(K+\omega),$$

i.e. putting $K - \omega = L$,    $$Lr - rL = 2r\omega.$$    (85)

Now since K and $\omega$, and consequently L, are diagonal matrices, we have

    $$(Lr - rL)_{mn} = (L_{mm} - L_{nn})\,r_{mn}$$

and $(r\omega)_{mn} = r_{mn}\omega$, where $\omega$ denotes now not the matrix but simply
the number $h\nu_0 m$, so that the preceding equation can be written in
the form

    $$(L_{mm} - L_{nn} - 2\omega)\,r_{mn} = 0.$$    (85 a)

Thus either $r_{mn} = 0$, or $L_{mm} - L_{nn} = 2\omega$. In the same way we get

    $$srs = (K+\omega)s = s(K-\omega)$$

and

    $$(L_{mm} - L_{nn} + 2\omega)\,s_{mn} = 0,$$    (85 b)

so that either $s_{mn} = 0$ or $L_{mm} - L_{nn} = -2\omega$. Now

    $$L_{mm} - L_{nn} = K_{mm} - K_{nn} = 2m(H_{mm} - H_{nn}) = 2m(W_m - W_n)$$

is the difference of the energy-levels for the states m and n multiplied
by 2m (m being the mass and not the label number of the state!). We
thus see that the energy-levels must form an arithmetical progression
with the difference $2\omega/2m = h\nu_0$, so that we can put

    $$W_n = nh\nu_0 + \text{const.}$$    (86)

With this labelling of the stationary states we must have

    $$r_{mn} = 0, \text{ unless } m = n+1,$$
    $$s_{mn} = 0, \text{ unless } m = n-1.$$
The value of the constant in the expression for $W_n$ can be obtained
from the condition that the lowest value of $L_{nn}$ must be equal to zero.
This condition follows from the equation

    $$(rs)_{nn} = r_{n,n-1}\,s_{n-1,n} = L_{nn}$$    (86 a)

in conjunction with the fact that $K_{nn}$ cannot assume negative values
because the matrix K represents an essentially positive or rather
non-negative quantity, namely, $2mH = p^2 + q^2$ (with p and q both real). Hence
we conclude that the series of stationary states must terminate with
some state $n_{\min}$ which we can obviously label as n = 0. The matrix
elements $r_{n,n-1}$ and $s_{n-1,n}$ must obviously vanish for $n \le 0$, since the
states $n \le -1$ do not exist, whence it follows that $L_{00} = 0$, or $K_{00} = \omega$,
and consequently $W_0 = \tfrac{1}{2}h\nu_0$, that is,

    $$W_n = h\nu_0\,(n + \tfrac{1}{2})$$    (86 b)

in agreement with the result obtained in Part I, § 13, by means of the
wave-mechanical treatment of the problem of the linear oscillator.

Further, for n > 0 we get

    $$r_{n,n-1}\,s_{n-1,n} = 2mh\nu_0 n.$$    (87)

Now from the definition of r and s according to (84) or

    $$r_{n,n-1} = p_{n,n-1} + iq_{n,n-1}, \qquad s_{n-1,n} = p_{n-1,n} - iq_{n-1,n},$$

together with the Hermitian character of the matrices p and q (which
expresses the reality of the quantities represented by them), it follows
that

    $$s_{n-1,n} = r^*_{n,n-1}.$$    (87 a)

We thus have

    $$|r_{n,n-1}| = |s_{n-1,n}| = \sqrt{2mh\nu_0 n}.$$    (87 b)

Coming back from r and s to p and q, we have $p = \tfrac{1}{2}(r+s)$,
$q = -\tfrac{1}{2}i(r-s)$, and consequently

    $$p_{n,n-1} = \tfrac{1}{2}r_{n,n-1}, \qquad p_{n-1,n} = \tfrac{1}{2}s_{n-1,n},$$
    $$q_{n,n-1} = -\tfrac{1}{2}i\,r_{n,n-1}, \qquad q_{n-1,n} = \tfrac{1}{2}i\,s_{n-1,n},$$    (88)

all the other matrix elements $p_{mn}$ and $q_{mn}$ vanishing.

We thus get

    $$|p_{n,n-1}| = |q_{n,n-1}| = \sqrt{\tfrac{1}{2}mnh\nu_0}$$    (88 a)

and, returning to the original coordinate, $x = q/(2\pi\nu_0 m)$,

    $$|x_{n,n-1}| = \sqrt{\frac{nh}{8\pi^2\nu_0 m}} = \frac{|p_{n,n-1}|}{2\pi\nu_0 m}.$$

The latter relation between x and p can be obtained directly from the
equation $p = m\,dx/dt$, which gives

    $$p_{nk} = m\,i2\pi\nu_{nk}\,x_{nk},$$

i.e. since $\nu_{nk} = (W_n - W_k)/h = (n-k)\nu_0$,

    $$p_{n,n-1} = 2\pi i\,m\nu_0\,x_{n,n-1}.$$
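The oscillator matrices obtained above can be checked on a finite truncation. The sketch below is an illustration only: it assumes units m = h = ν₀ = 1 by default, chooses the phases of r_{n,n-1} real and positive (the theory leaves them arbitrary), and uses hypothetical helper names. It rebuilds p and q from (84) and (87 b) and forms K = p² + q² and pq − qp.

```python
import math

def matmul(X, Y):
    """Product of two square matrices given as lists of rows."""
    size = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]

def oscillator_p_q(size, m=1.0, h=1.0, nu0=1.0):
    """Truncated p and q built from r_{n,n-1} = s_{n-1,n} = sqrt(2 m h nu0 n)."""
    r = [[0j] * size for _ in range(size)]
    s = [[0j] * size for _ in range(size)]
    for n in range(1, size):
        amp = math.sqrt(2.0 * m * h * nu0 * n)
        r[n][n - 1] = amp  # r_{n,n-1}, phase taken as zero (assumption)
        s[n - 1][n] = amp  # s_{n-1,n} = conjugate of r_{n,n-1}
    p = [[0.5 * (r[i][j] + s[i][j]) for j in range(size)] for i in range(size)]
    q = [[-0.5j * (r[i][j] - s[i][j]) for j in range(size)] for i in range(size)]
    return p, q

def check(size=6, m=1.0, h=1.0, nu0=1.0):
    """Return K = p^2 + q^2 and the commutator pq - qp for the truncated matrices."""
    p, q = oscillator_p_q(size, m, h, nu0)
    pp, qq = matmul(p, p), matmul(q, q)
    pq, qp = matmul(p, q), matmul(q, p)
    K = [[pp[i][j] + qq[i][j] for j in range(size)] for i in range(size)]
    comm = [[pq[i][j] - qp[i][j] for j in range(size)] for i in range(size)]
    return K, comm

K, comm = check()
print([K[n][n].real for n in range(5)])  # compare with 2 m h nu0 (n + 1/2)
print([comm[n][n] for n in range(5)])    # compare with -i h nu0 m
```

The last diagonal element of each matrix is spoiled by the truncation (a finite commutator must have zero trace), so only the elements with n below size − 1 should be compared with (83 b) and (86 b).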

The derivation of the formulae (88) and (88 a) by the purely
wave-mechanical method, i.e. through evaluation of the integrals

    $$x_{mn} = \int \psi_m^*\,x\,\psi_n\,dx \quad\text{and}\quad p_{mn} = \frac{h}{2\pi i}\int \psi_m^*\,\frac{d\psi_n}{dx}\,dx,$$

where $\psi_m$ and $\psi_n$ are the normalized characteristic functions of the
harmonic oscillator, would require a much larger amount of more
complicated calculation.

In the case of the hydrogen-like atom, the wave-mechanical method,
on the contrary, proves much more simple and convenient than the
matrix method for the determination of the energy values and the
matrix components. The matrix method can, however, be applied with
advantage in this case, as well as in the general case of the motion
of a particle in any central field of force, for the determination of
quantities which wave-mechanically depend upon the angular part of
the wave functions only [i.e. on the spherical harmonic functions
$Y_{lm}(\theta,\phi)$].

Here belong in the first place the components of the angular momentum
$M_x$, $M_y$, $M_z$, or rather their matrix elements with regard to states
differing from each other by the values of the axial quantum number
m (or also of the angular quantum number l), including, of course,
their characteristic values.

The purely matrix determination of these quantities can be obtained
most simply if one starts from the commutation relation

    $$\mathbf{M}\times\mathbf{M} = -\frac{h}{2\pi i}\,\mathbf{M}$$

which has been deduced in the preceding chapter with the help of the
operator definition of the vector M.

We shall put, for the sake of brevity,

    $$M_x = \frac{h}{2\pi}A, \qquad M_y = \frac{h}{2\pi}B, \qquad M_z = \frac{h}{2\pi}C,$$

so that the commutation relation above referred to assumes the form

    $$AB - BA = iC, \qquad BC - CB = iA, \qquad CA - AC = iB,$$    (89)

A, B, and C being regarded here as matrices.

We shall introduce the matrix

    $$N = A^2 + B^2 + C^2$$    (89 a)

which (multiplied by $h^2/4\pi^2$) represents the square of the total angular
momentum ($M^2$), and shall show that it commutes with each of the
matrices A, B, C (the proof is the same as if they were treated as
operators).

We have, namely,

    $$CA^2 - A^2C = (CA-AC)A + A(CA-AC) = +i(BA+AB),$$

and similarly

    $$CB^2 - B^2C = (CB-BC)B + B(CB-BC) = -i(AB+BA).$$

Adding these equations to the equation $CC^2 - C^2C = 0$, we get

    $$CN - NC = 0,$$    (89 b)

and in the same way $AN - NA = 0$ and $BN - NB = 0$.

Since, moreover, we know that N commutes with the energy matrix
H, it must be a constant of the motion, and its characteristic values,
together with the characteristic values of H, i.e. the diagonal elements
of N and H in a matrix representation corresponding to characteristic
functions of both H and N, can be used to specify the stationary states.
We know, furthermore, that these characteristic functions can be chosen
in such a way [by putting $Y_{lm}(\theta,\phi) = P_{lm}(\theta)e^{im\phi}$] that one of the three
matrices A, B, C (C say) shall also be diagonal (corresponding to
$C\psi = \text{const.}\,\psi$). Using the results obtained before by the
wave-mechanical method, we can thus define N and C as diagonal matrices
with the elements

    $$N_{l,m;\,l,m} = l(l+1), \qquad C_{l,m;\,l,m} = m.$$    (89 c)

These results can be obtained independently by the purely matrix
method, if we confine ourselves to matrix elements corresponding to
the same energy values and assume both N and C to be diagonal
matrices (which we obviously can do for the sake of simplicity, although
this is by no means necessary).

We shall consider first such matrix elements of A and B as correspond
to states with the same value of N and shall distinguish these states
accordingly by one index m only, specifying the characteristic values
(i.e. the diagonal elements) of C.

As in the case of the oscillator, we shall not consider A and B
separately but in the conjugate complex combinations

    $$A + iB = R, \qquad A - iB = S.$$    (90)

Replacing the K of the oscillator theory by C, we have, according
to (89),

    $$(A+iB)C - C(A+iB) = (AC-CA) + i(BC-CB) = -(iB + A),$$

i.e.    $$CR - RC = R,$$    (90 a)

and similarly    $$CS - SC = -S.$$    (90 b)

These equations are of exactly the same form as equation (85) for r and
the corresponding equation for s, the constant $\omega$ being replaced by $\tfrac{1}{2}$.
We thus get, in the same way as before,

    $$C_{mm} = m + \text{const.},$$    (91)

the non-vanishing elements of R and S being

    $$R_{m,m-1} \quad\text{and}\quad S_{m-1,m}$$

and having the same numerical value since

    $$S_{m-1,m} = R^*_{m,m-1}.$$    (91 a)

The latter, together with the value of the constant in (91), can be
derived from the equation

    $$RS = (A+iB)(A-iB) = A^2 + B^2 + C = A^2 + B^2 + C^2 + \tfrac{1}{4} - (C^2 - C + \tfrac{1}{4}),$$

i.e.    $$RS = N + \tfrac{1}{4} - (C - \tfrac{1}{2})^2.$$    (92)

Taking the diagonal elements of both sides, we get

    $$(RS)_{mm} = R_{m,m-1}S_{m-1,m} = N + \tfrac{1}{4} - (C_{mm} - \tfrac{1}{2})^2,$$    (92 a)

where N now denotes not the matrix N but the diagonal element of
this matrix corresponding to the state in question (with no subscript
mm affixed to it because it does not depend upon m). In a similar way
we find

    $$(SR)_{mm} = S_{m,m+1}R_{m+1,m} = N + \tfrac{1}{4} - (C_{mm} + \tfrac{1}{2})^2.$$    (92 b)

It should be remarked that the same expression can be written in the
form $(RS)_{m+1,m+1}$, so that we must have, according to (92 a),

    $$(C_{mm} + \tfrac{1}{2})^2 = (C_{m+1,m+1} - \tfrac{1}{2})^2,$$

which is, of course, in agreement with (91).

Now since $A^2 + B^2 + C^2 = N$, the characteristic values of the operator
C or, what is the same thing, the diagonal elements of the matrix C
must lie within certain limits, the maximum value C' not exceeding
$+N^{1/2}$ and the minimum value C'' being not smaller than $-N^{1/2}$. Denoting
the corresponding limiting values of m by m' and m'' respectively, we
must have

    $$S_{m',m'+1} = R_{m'+1,m'} = 0 \quad\text{and}\quad R_{m'',m''-1} = S_{m''-1,m''} = 0.$$

This gives, according to (92 b),

    $$C_{m'm'} = -\tfrac{1}{2} + \sqrt{N + \tfrac{1}{4}}$$

and, according to (92 a),

    $$C_{m''m''} = \tfrac{1}{2} - \sqrt{N + \tfrac{1}{4}} = -C_{m'm'}$$    (93)

as would be expected from the fact that the relation $A^2 + B^2 + C^2 = N$
determines the square of C.

The difference $C_{m'm'} - C_{m''m''} = m' - m''$ is obviously an integral
number, $I - 1$ say, where I is equal to the number of states with different
values of $C_{mm}$ which are possible for a given value of N. We thus obtain
the following condition for N:

    $$2\sqrt{N + \tfrac{1}{4}} = \text{integer} = I,$$

that is,    $$N = \tfrac{1}{4}(I^2 - 1) = \tfrac{1}{4}(I+1)(I-1).$$    (93 a)

This expression reduces to the usual form

    $$N = l(l+1)$$    (94)

if we put $I = 2l + 1$, i.e. define I as an odd integer, giving for the
limiting values of $C_{mm}$

    $$C_{m'm'} = +l, \qquad C_{m''m''} = -l,$$

i.e. by (91) $m' = +l$, $m'' = -l$. We thus get


    $$C_{mm} = m$$    (94 a)

and consequently

    $$M^2 = \frac{h^2}{4\pi^2}\,l(l+1), \qquad M_z = \frac{h}{2\pi}\,m,$$    (94 b)

in accordance with our previous results. It is, however, important to
notice that the matrix theory admits another possibility corresponding
to I being an even integer, 2k say. We get, in this case,

    $$N = (k + \tfrac{1}{2})(k - \tfrac{1}{2})$$    (95)

and

    $$C_{m'm'} = k - \tfrac{1}{2}, \qquad C_{m''m''} = -(k - \tfrac{1}{2}),$$    (95 a)

whence

    $$C_{mm} = m + \tfrac{1}{2}$$    (95 b)

with $m' = k - 1$ and $m'' = -k$, or

    $$C_{mm} = m - \tfrac{1}{2}$$

with $m' = k$ and $m'' = -(k-1)$. These results can be put in the same
form as the preceding results if we define l as a half-integral angular
quantum number

    $$l = k - \tfrac{1}{2}$$

and m as a half-integral axial quantum number, varying between the
limits $+l$ and $-l$.

We shall then get, as before, $C_{mm} = m$. We thus see, by this example,
that the matrix theory is, in a certain respect, more general than the
wave-mechanical theory, at least in that form in which it has been
developed hitherto. We shall give in a later chapter a generalization
of it which provides an equivalent for the half-integral values of l and
m of the matrix theory of the angular momentum.
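The two possibilities (I odd or even) can be tabulated together. The following is a small sketch using exact fractions; the function name is hypothetical, and the characteristic values of C are simply taken to run from −l to +l in unit steps, as the text concludes.

```python
from fractions import Fraction

def angular_spectrum(I):
    """For I states, return (l, N, values of C) with N = (I^2 - 1)/4 and l = (I - 1)/2."""
    if I < 1:
        raise ValueError("I must be a positive integer")
    l = Fraction(I - 1, 2)
    N = Fraction(I * I - 1, 4)
    c_values = [-l + k for k in range(I)]  # C runs from -l to +l in unit steps
    return l, N, c_values

for I in (1, 2, 3, 4):
    l, N, c_values = angular_spectrum(I)
    print(I, l, N, [str(c) for c in c_values])
```

Odd I gives integral l and m; even I gives the half-integral values admitted by the matrix theory. In both cases N equals l(l+1).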

The non-diagonal matrices of the x and y components of the latter
can easily be derived from (90), (91 a), and (92 a). We shall not,
however, examine the matrices $M_x$ and $M_y$ separately, but shall examine
their combinations

    $$M_x + iM_y = \frac{h}{2\pi}R, \qquad M_x - iM_y = \frac{h}{2\pi}S,$$

for the non-vanishing elements of which the following expressions are
obtained,

    $$(M_x + iM_y)_{m+1,m} = \frac{h}{2\pi}\sqrt{(l+\tfrac{1}{2})^2 - (m+\tfrac{1}{2})^2}\;e^{i\alpha_m},$$    (96)

    $$(M_x - iM_y)_{m,m+1} = \frac{h}{2\pi}\sqrt{(l+\tfrac{1}{2})^2 - (m+\tfrac{1}{2})^2}\;e^{-i\alpha_m},$$    (96 a)

where $\alpha_m$ is an arbitrary phase.

A derivation of these results by the usual wave-mechanical method,
i.e. by means of the integral expressions for the matrix elements, would
require a thorough knowledge of the spherical harmonic functions
$Y_{lm}(\theta,\phi)$ and would be much more laborious than the preceding
calculations.
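The matrices just obtained can be verified directly for small l, integral or half-integral. The sketch below is illustrative only: it works in units of h/2π, sets every phase α_m to zero (an assumption, since the phases are arbitrary), orders the basis as m = −l, ..., +l, and uses hypothetical helper names.

```python
import math

def matmul(X, Y):
    """Product of two square matrices given as lists of rows."""
    size = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]

def angular_matrices(two_l):
    """A, B, C for l = two_l / 2, built from R_{m+1,m} = S_{m,m+1} as in (96)."""
    l = two_l / 2.0
    dim = two_l + 1
    m_values = [-l + k for k in range(dim)]
    R = [[0j] * dim for _ in range(dim)]
    S = [[0j] * dim for _ in range(dim)]
    for k in range(dim - 1):
        m = m_values[k]
        amp = math.sqrt((l + 0.5) ** 2 - (m + 0.5) ** 2)
        R[k + 1][k] = amp  # R_{m+1,m}
        S[k][k + 1] = amp  # S_{m,m+1}
    A = [[0.5 * (R[i][j] + S[i][j]) for j in range(dim)] for i in range(dim)]
    B = [[-0.5j * (R[i][j] - S[i][j]) for j in range(dim)] for i in range(dim)]
    C = [[complex(m_values[i]) if i == j else 0j for j in range(dim)]
         for i in range(dim)]
    return A, B, C

def check(two_l):
    """Return the largest deviation of AB - BA from iC, and N = A^2 + B^2 + C^2."""
    A, B, C = angular_matrices(two_l)
    dim = len(A)
    AB, BA = matmul(A, B), matmul(B, A)
    deviation = max(abs(AB[i][j] - BA[i][j] - 1j * C[i][j])
                    for i in range(dim) for j in range(dim))
    AA, BB, CC = matmul(A, A), matmul(B, B), matmul(C, C)
    N = [[AA[i][j] + BB[i][j] + CC[i][j] for j in range(dim)] for i in range(dim)]
    return deviation, N

for two_l in (1, 2, 3):
    deviation, N = check(two_l)
    print(two_l / 2.0, deviation, [N[i][i].real for i in range(len(N))])
```

The case two_l = 1 is the half-integral possibility l = 1/2 discussed above; it satisfies the same relations (89) and gives N = 3/4.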

The preceding method can also be applied to the calculation of
the matrix elements of the coordinates x, y, z and momentum
components $p_x$, $p_y$, $p_z$, for such states at least as differ from each
other in the quantum numbers m and l only (and which in the case
of the hydrogen-like atom belong to the same energy-level). To do
this we shall examine first the expressions $M_z x - xM_z$, $M_z y - yM_z$, and
$M_z z - zM_z$. Since $\dfrac{2\pi i}{h}(M_z x - xM_z) = [M_z, x]$ and $M_z = xp_y - yp_x$,
we get $[M_z, x] = -y$, and in the same way $[M_z, y] = +x$, $[M_z, z] = 0$.

Putting

    $$x + iy = \xi, \qquad x - iy = \eta$$    (97)

we thus have

    $$[M_z, \xi] = -y + ix = i(x + iy) = i\xi,$$
    $$[M_z, \eta] = -y - ix = -i(x - iy) = -i\eta,$$

or, with $M_z = hC/2\pi$,

    $$C\xi - \xi C = \xi, \qquad C\eta - \eta C = -\eta$$    (97 a)

and    $$Cz - zC = 0.$$    (97 b)

It follows immediately from these relations that, so far as the quantum
number m is concerned (l being left undetermined), z is a diagonal
matrix with non-vanishing elements $z_{mm}$, while $\xi$ and $\eta$ are matrices
with non-vanishing elements of the form

    $$\xi_{m,m-1} \quad\text{and}\quad \eta_{m-1,m}$$

as in the case of the harmonic oscillator.

Let us consider now the commutation relations between the
quantities (operators, matrices) $\xi$, $\eta$ on the one hand and R, S on the other.
We have

    $$[M_x + iM_y, \xi] = [M_x, \xi] + i[M_y, \xi] = i[M_x, y] + i[M_y, x] = i(-z + z) = 0,$$

since $[M_x, x] = [M_y, y] = 0$, and similarly

    $$[M_x - iM_y, \xi] = -2iz,$$

so that

    $$R\xi - \xi R = 0,$$    (98)
    $$S\xi - \xi S = -2z.$$    (98 a)

From the first of these equations we get

    $$(R\xi)_{m+1,m-1} - (\xi R)_{m+1,m-1} = R_{m+1,m}\,\xi_{m,m-1} - \xi_{m+1,m}\,R_{m,m-1} = 0,$$

i.e.

    $$\frac{\xi_{m+1,m}}{R_{m+1,m}} = \frac{\xi_{m,m-1}}{R_{m,m-1}} = \text{const.} = a,$$

and likewise from (98 a)

    $$2z_{mm} = (\xi S)_{mm} - (S\xi)_{mm} = \xi_{m,m-1}S_{m-1,m} - S_{m,m+1}\,\xi_{m+1,m}$$
    $$= a\,[R_{m,m-1}S_{m-1,m} - S_{m,m+1}R_{m+1,m}].$$

We thus see that the non-vanishing matrix elements of the
coordinates are determined, disregarding an irrelevant proportionality
factor, by the matrix elements of the angular momentum. Substituting,
in the preceding equations, the expressions for $R_{m,m-1}$ and $S_{m-1,m}$
derived before, we get
$2z_{mm} = a\{[(l+\tfrac{1}{2})^2 - (m-\tfrac{1}{2})^2] - [(l+\tfrac{1}{2})^2 - (m+\tfrac{1}{2})^2]\}$,
i.e.

    $$z_{mm} = am, \qquad |\xi_{m+1,m}| = |\eta_{m,m+1}| = |a|\sqrt{(l+\tfrac{1}{2})^2 - (m+\tfrac{1}{2})^2}.$$    (98 b)

In deriving these results it was tacitly assumed that the total
momentum remained invariant, i.e. that the angular quantum number l
preserved the same value in the different states to which the matrix
elements (98 b) refer. Affixing the index l, we should have written the
latter in the more complete form $z_{l,m;\,l,m}$, $\xi_{l,m+1;\,l,m}$, etc.

In order to find out the matrix elements which correspond to different
values of l, we must take into account certain commutation relations
containing the matrix of the total momentum, or its square N ($\times\,h^2/4\pi^2$).
Taking, for instance, the relation

    $$NR - RN = 0$$

(which follows from $NA - AN = 0$ and $NB - BN = 0$), we have, since
N is a diagonal matrix with regard both to l and m (as a matter of fact
not depending upon m),

    $$(NR - RN)_{l',m';\,l'',m''} = (N_{l'} - N_{l''})\,R_{l',m';\,l'',m''} = 0.$$

We thus see that $R_{l',m';\,l'',m''}$ vanishes unless $l' = l''$ as was assumed
above. This assumption is therefore justified so far as the components
of the angular momentum are concerned (it can be proved in the same
way for S and C). It need not, however, hold for the coordinates,
i.e. for the matrices $\xi$, $\eta$, z.

Taking, for instance, the $(l', m';\,l'', m'')$-element of (98), we have

    $$\sum_{m'''}\bigl(R_{l',m';\,l',m'''}\,\xi_{l',m''';\,l'',m''} - \xi_{l',m';\,l'',m'''}\,R_{l'',m''';\,l'',m''}\bigr) = 0.$$

Now it can easily be seen that the results derived from (97 a) and (97 b)
as to the non-vanishing elements of $\xi$, $\eta$, and z, so far as they are
specified by the quantum number m, remain valid irrespective of the
equality or inequality of the numbers l' and l'' (since these results
depend solely upon the diagonal character of C with regard to m). The
preceding equation need therefore be examined only for the case when
$m'' = m' - 2$. Putting $m' = m + 1$ and $m'' = m - 1$, we get

    $$R_{l',m+1;\,l',m}\,\xi_{l',m;\,l'',m-1} = \xi_{l',m+1;\,l'',m}\,R_{l'',m;\,l'',m-1}.$$    (99)

The angular quantum number l represents the maximum absolute value
of the axial quantum number m. This means that the matrix element
$R_{l',m+1;\,l',m}$ will vanish unless both $|m| \le l'$ and $|m+1| \le l'$; likewise
$\xi_{l',m;\,l'',m-1}$ will vanish unless $|m| \le l'$ and $|m-1| \le l''$; further
$\xi_{l',m+1;\,l'',m}$ will vanish unless $|m+1| \le l'$ and $|m| \le l''$, and finally
$R_{l'',m;\,l'',m-1}$ will vanish unless $|m| \le l''$ and $|m-1| \le l''$. Since
equations (99) must hold for all values of m, both sides vanishing
simultaneously, we can conclude that l' and l'' must be connected with
each other in such a way that the violation of one of the conditions

    $$|m| \le l', \qquad |m+1| \le l', \qquad |m-1| \le l''$$

will entail the violation of one of the conditions

    $$|m+1| \le l', \qquad |m| \le l'', \qquad |m-1| \le l''.$$

This will obviously be the case if $l' = l''$, or $l' = l'' + 1$, or $l' = l'' - 1$.
We thus see that only those matrix elements of $\xi$ will be different
from zero for which

    $$l' - l'' = 0,\ +1,\ -1.$$    (99 a)


For otherwise we could, by a suitable choice of m, make one side of
(99) vanish while the other would be different from zero.

The same applies, of course, to the matrix elements of $\eta$ and z, or, in
other words, to the matrix elements of all the three coordinates.

Putting in (99) $l' = l$ and $l'' = l - 1$, and replacing the matrix
elements of R by their expressions (96), we get

    $$\sqrt{(l+\tfrac{1}{2})^2 - (m+\tfrac{1}{2})^2}\;\xi_{l,m;\,l-1,m-1} = \sqrt{(l-\tfrac{1}{2})^2 - (m-\tfrac{1}{2})^2}\;\xi_{l,m+1;\,l-1,m}$$

or

    $$\sqrt{(l+m+1)(l-m)}\;\xi_{l,m;\,l-1,m-1} = \sqrt{(l+m-1)(l-m)}\;\xi_{l,m+1;\,l-1,m}.$$

Replacing here the common factor $\sqrt{l-m}$ by $\sqrt{l+m}$, and taking
into account that the expression $(l+m+1)(l+m)$ is obtained from
$(l+m-1)(l+m)$ by replacing m by m+1, we can put

    $$\xi_{l,m+1;\,l-1,m} = b\sqrt{(l+m)(l+m+1)},$$    (100)

where b is a proportionality coefficient which does not depend either
on l or on m.

Substituting this expression in the equation

    $$-2z_{l,m;\,l-1,m} = S_{l,m;\,l,m+1}\,\xi_{l,m+1;\,l-1,m} - \xi_{l,m;\,l-1,m-1}\,S_{l-1,m-1;\,l-1,m}$$

which follows from (98 a), and putting

    $$S_{l,m-1;\,l,m} = R_{l,m;\,l,m-1} = \sqrt{(l+\tfrac{1}{2})^2 - (m-\tfrac{1}{2})^2},$$

we get    $$z_{l,m;\,l-1,m} = -b\sqrt{l^2 - m^2}.$$    (100 a)

In a similar way for the case $l' = l - 1$, $l'' = l$ we obtain

    $$\xi_{l-1,m+1;\,l,m} = b'\sqrt{(l-m)(l-m-1)}$$    (100 b)

    $$z_{l-1,m;\,l,m} = b'\sqrt{l^2 - m^2}$$    (100 c)

where b' is another coefficient of proportionality, which can be shown
to have the same numerical value as b.
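The algebra leading to (100) and (100 a) is easy to check numerically. The sketch below is an illustration under stated assumptions: integral l, all phases real and positive, b = 1 by default, and hypothetical function names. It evaluates both sides of (99) for l' = l, l'' = l − 1 and then forms z from the relation following from (98 a).

```python
import math

def R_elem(l, m):
    """R_{l,m+1;l,m} = S_{l,m;l,m+1} from (96), with the phase taken as zero."""
    return math.sqrt((l + 0.5) ** 2 - (m + 0.5) ** 2)

def xi_elem(l, m, b=1.0):
    """xi_{l,m+1;l-1,m} according to (100)."""
    return b * math.sqrt((l + m) * (l + m + 1))

def both_sides_of_99(l, m, b=1.0):
    """Left and right sides of (99) for l' = l, l'' = l - 1."""
    lhs = R_elem(l, m) * xi_elem(l, m - 1, b)      # R_{l,m+1;l,m} xi_{l,m;l-1,m-1}
    rhs = xi_elem(l, m, b) * R_elem(l - 1, m - 1)  # xi_{l,m+1;l-1,m} R_{l-1,m;l-1,m-1}
    return lhs, rhs

def z_elem(l, m, b=1.0):
    """z_{l,m;l-1,m} from -2z = S xi - xi S, as in the equation following from (98 a)."""
    first = R_elem(l, m) * xi_elem(l, m, b)            # S_{l,m;l,m+1} xi_{l,m+1;l-1,m}
    second = xi_elem(l, m - 1, b) * R_elem(l - 1, m - 1)  # xi_{l,m;l-1,m-1} S_{l-1,m-1;l-1,m}
    return -0.5 * (first - second)

l = 3
for m in range(-(l - 1), l):
    print(m, both_sides_of_99(l, m), z_elem(l, m), -math.sqrt(l * l - m * m))
```

The loop is restricted to |m| ≤ l − 1, the values for which the state (l − 1, m) exists; for these the two sides of (99) agree and z reproduces −b√(l² − m²).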

It is interesting to compare the preceding results† with the
wave-mechanical method for the determination of matrix elements of the
coordinates for a hydrogen-like atom.

We have, for instance,

    $$z_{n,l,m;\,n',l',m'} = \int \psi^*_{nlm}\,z\,\psi_{n'l'm'}\,dV$$

or, putting $\psi_{nlm} = f_{nl}(r)P_{lm}(\theta)e^{im\phi}$, $dV = r^2\,dr\,d\omega$,
$d\omega = \sin\theta\,d\theta\,d\phi$, and $z = r\cos\theta$,

    $$z_{n,l,m;\,n',l',m'} = \int_0^{\infty} f_{nl}(r)f_{n'l'}(r)\,r^3\,dr \int_0^{\pi} P_{lm}(\theta)P_{l'm'}(\theta)\cos\theta\sin\theta\,d\theta \int_0^{2\pi} e^{i(m'-m)\phi}\,d\phi.$$

We see, first of all, that on account of the last factor this expression
vanishes unless $m' = m$. In addition it can be shown that the second
factor also vanishes unless $l' = l \pm 1$. The proof is based on the fact
that the product $\cos\theta\,P_{lm}(\theta)$ can be represented as the sum of two
functions $P_{l+1,m}(\theta)$ and $P_{l-1,m}(\theta)$ with suitably chosen coefficients, and
on the orthogonality of the functions $Y_{lm}(\theta,\phi)$ corresponding to different
values of l [as characteristic functions of the operator $\Omega^2$ with the
characteristic values $-l(l+1)$].

Replacing z by $\xi = x + iy = r\sin\theta(\cos\phi + i\sin\phi) = r\sin\theta\,e^{i\phi}$, we
get, in a similar way,

    $$\xi_{n,l,m;\,n',l',m'} = \int_0^{\infty} f_{nl}(r)f_{n'l'}(r)\,r^3\,dr \int_0^{\pi} P_{lm}(\theta)P_{l'm'}(\theta)\sin^2\theta\,d\theta \int_0^{2\pi} e^{i(m'-m+1)\phi}\,d\phi.$$

The examination of the last factor shows at once that this expression
vanishes unless $m' = m - 1$; the second factor vanishes likewise if
$l' \ne l \pm 1$.
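The vanishing of the polar factor can be illustrated numerically in the simplest case m = m' = 0, where the functions P_{l0} reduce to ordinary Legendre polynomials. The sketch below is limited to that case (an assumption made for brevity); it uses the standard three-term recurrence and Simpson's rule, and the function names are hypothetical.

```python
def legendre(l, u):
    """Legendre polynomial P_l(u) from (k+1) P_{k+1} = (2k+1) u P_k - k P_{k-1}."""
    if l == 0:
        return 1.0
    p_prev, p = 1.0, u
    for k in range(1, l):
        p_prev, p = p, ((2 * k + 1) * u * p - k * p_prev) / (k + 1)
    return p

def polar_integral(l, l_prime, steps=2000):
    """Simpson estimate of the integral of P_l P_l' cos(theta) sin(theta) over 0..pi.

    With u = cos(theta) this is the integral of P_l(u) P_l'(u) u over -1..1.
    """
    n = steps if steps % 2 == 0 else steps + 1
    width = 2.0 / n
    total = 0.0
    for j in range(n + 1):
        u = -1.0 + j * width
        weight = 1 if j in (0, n) else (4 if j % 2 else 2)
        total += weight * legendre(l, u) * legendre(l_prime, u) * u
    return total * width / 3.0

for l_prime in range(0, 5):
    print(1, l_prime, polar_integral(1, l_prime))
```

For l = 1 only l' = 0 and l' = 2 give a result appreciably different from zero, in accordance with the rule l' = l ± 1.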

The conditions relative to m coincide with those obtained by the
matrix method for z and $\xi$; the condition $l' - l = \pm 1$ is, however, more
restrictive, since it excludes the case $l' = l$.

We see that here again, as for the values of l (integral or half-integral),
the matrix method leads to results of higher generality than the
wave-mechanical method. It should not be inferred that the results obtained
by the latter are incorrect. On the contrary, it is the results obtained by
the matrix method which require some qualification. The reason for
this is that the properties of the matrices which represent the
components of the angular momentum of an electron are not completely
specific, but, as we shall see later, are shared by matrices representing
allied quantities of a more general character, which can be considered
as the resultant of the angular momentum due to rotation about a fixed
centre and the so-called ‘intrinsic angular momentum’ of the electron,
whose origin is usually ascribed to its spin motion.

† Derived in the above way by Born and Jordan.

It is possible to generalize the wave-mechanical theory in such a way
as to interpret this ‘spin effect’ and to incorporate the intrinsic
momentum, allowing for the resultant angular quantum number or, as it is
called, the ‘inner quantum number’ j both integral and half-integral
values and allowing transitions, i.e. non-vanishing matrix elements of
the coordinates, for which this number changes by $\pm 1$ or remains
constant. This does not, however, invalidate in the least the fact that the
angular quantum number l, representing the ‘orbital angular momentum’
of the particle, can assume integral values only and obeys the
restricted ‘selection rule’ $l' - l = \pm 1$.

The fact that we have obtained, by the matrix method, non-vanishing 
expressions (98 b) for the matrix elements of the coordinates in the case 
l'—l = 0 does not contradict the wave-mechanical theory, for these 
expressions contain a proportionality factor a , which has not been 
specified and which can easily be shown to be equal to zero in the 
case considered (if l denotes the orbital and not the total angular 
quantum number). 

The matrix elements of the coordinates which we have calculated
have a direct and indeed very important physical significance. They
determine, according to the formula

    $$A_{nn'} = \frac{64\pi^4 e^2}{3c^3 h}\,\nu_{nn'}^3\,|x^0_{nn'}|^2,$$

where e denotes the electric charge of the particle, the probability of
a spontaneous transition with emission of light, i.e. they determine the
intensity of the different lines in the emission spectrum of the
corresponding system or the degree of their ‘blackness’ in the absorption
spectrum [see Part I, § 13]. Such pairs of states n, n' for which the
matrix elements $x^0_{nn'}$ vanish do not combine with each other, in the
sense that transitions between them connected with the emission or
absorption of light, corresponding to oscillations in the x-direction, that
is to say, ‘polarized’ in this direction, are impossible. The relations
between the quantum numbers which characterize the ‘allowed’
transitions (corresponding to the non-vanishing matrix elements) are called
‘selection rules’. The latter, as we have just seen, can be different for
different coordinates. For instance, in the case of the z-coordinate
(i.e. of light polarized in the z-direction) they amount to $l' - l = \pm 1$
and $m' = m$, while in the case of the x, y-coordinates they are $l' - l = \pm 1$
and $m' = m \pm 1$.
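The selection rules just stated can be written as a small predicate. This is only a sketch of the rules for the orbital quantum numbers as given in the text; the function name and the labels "z" and "xy" for the two polarizations are hypothetical.

```python
def is_allowed(l, m, l_prime, m_prime, coordinate):
    """True if (l, m) -> (l', m') has a non-vanishing matrix element of the coordinate.

    coordinate is "z" (light polarized along z) or "xy" (the x and y coordinates).
    """
    for angular, axial in ((l, m), (l_prime, m_prime)):
        if angular < 0 or abs(axial) > angular:
            raise ValueError("need l >= 0 and |m| <= l")
    if abs(l_prime - l) != 1:
        return False
    if coordinate == "z":
        return m_prime == m
    if coordinate == "xy":
        return abs(m_prime - m) == 1
    raise ValueError("coordinate must be 'z' or 'xy'")

print(is_allowed(1, 0, 0, 0, "z"))   # l changes by 1, m unchanged
print(is_allowed(1, 1, 0, 0, "xy"))  # l changes by 1, m changes by 1
print(is_allowed(2, 0, 0, 0, "z"))   # l changes by 2
```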

This distinction between the different coordinates is a purely formal
one in the case of a radially symmetrical field of force, because of the
degeneracy connected with such a field. This degeneracy (with respect
to the different values of m) can be eliminated, as will be shown later,
by the presence of a magnetic field parallel to the z-axis (Zeeman effect).
If the latter is weak enough, the preceding expressions for the matrix
elements of z and of $x \pm iy$ will remain approximately valid and will
determine the intensity of the spectrum lines linearly polarized in
the direction of the magnetic field or circularly polarized about this
direction.

14. Matrix Representation in the Case of a Continuous Spectrum 

We have limited ourselves hitherto to the matrix representation of
physical quantities where the states concerned form a discrete set,
corresponding to a discrete spectrum of the energy operator $H$.

The case of a continuous spectrum corresponding to a continuous or
‘mixed’ set of states specified by functions of the type $\psi_C$ or
$\psi_{C n_1 n_2}$, etc. (§ 11), can be dealt with in a similar manner.
The matrix elements of any operator $F$ are defined in this case in exactly
the same way as in the preceding case, i.e. by integrals of the form

$$F_{C'C''} = \int \psi^*_{C'}\,F\,\psi_{C''}\,dV \qquad (101)$$

or

$$F_{C'n_1'n_2';\,C''n_1''n_2''} = \int \psi^*_{C'n_1'n_2'}\,F\,\psi_{C''n_1''n_2''}\,dV, \qquad (101a)$$

and so on.

These integrals as a rule do not converge, and are similar to the
Dirac function $\delta(C'-C'')$ which was introduced and discussed in § 10,
and to which the matrix elements of $F$ actually reduce if $F$ represents
the energy $H$ or any other constant of the motion commuting
with $H$ and satisfying the equation $F\psi_C = F_C\psi_C$. We then get,
according to (101),

$$F_{C'C''} = F_{C''}\int \psi^*_{C'}\psi_{C''}\,dV,$$

that is,

$$F_{C'C''} = F_{C''}\,\delta(C'-C''). \qquad (101b)$$

This expression corresponds to a ‘diagonal matrix’ of the discrete case,
just as $\delta(C'-C'')$ corresponds to the unit matrix.

The somewhat indefinite character of the matrix elements $F_{C'C''}$ can
be removed in the same way as in the simplest case $F = 1$, when $F_{C'C''}$
reduces to the function $\delta(C'-C'')$, namely, by extending the
integration in (101) over a finite volume, and passing to the limit
$V \to \infty$ after completing the integration over $C'$ or $C''$ which
always occurs in problems of physical interest.† The simplest example of
such a problem is the calculation of the probable value of some quantity
$F$ for a motion specified by a wave function of the type

$$\psi = \int a_C\,\psi_C\,dC, \qquad (102)$$

which can be considered as the superposition of a large number of
‘wave packets’ corresponding to very small intervals of the parameter
$C$. Although the integrals $\int |\psi_C|^2\,dV$ diverge, the integral
$\int |\psi|^2\,dV$ remains in general finite and can be normalized to 1,
just as in the discrete case when $\psi = \sum_n a_n\psi_n$.
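The procedure of integrating first over a finite range and only afterwards passing to the limit can be tried numerically. The sketch below is an illustration under assumptions not made in the text: the functions $\psi_k$ are taken to be plane waves $e^{i2\pi kx}$ in one dimension, the ‘volume’ is the segment $-L \le x \le L$, and the amplitude is the hypothetical smooth function $a(k) = e^{-k^2/2}$. The names `delta_L` and `smeared_value` are invented for the example.

```python
import math


def delta_L(dk, L):
    """Integral of exp(i*2*pi*dk*x) over -L <= x <= L (finite-range 'delta')."""
    if dk == 0.0:
        return 2.0 * L
    return math.sin(2.0 * math.pi * dk * L) / (math.pi * dk)


def smeared_value(L, k_max=8.0, n=32000):
    """Midpoint-rule estimate of the integral of a(k) * delta_L(k, L) dk."""
    step = 2.0 * k_max / n
    total = 0.0
    for j in range(n):
        k = -k_max + (j + 0.5) * step
        total += math.exp(-0.5 * k * k) * delta_L(k, L)
    return total * step


if __name__ == "__main__":
    # As L grows the result tends to a(0) = 1, although delta_L itself
    # has no limit for any fixed k different from zero.
    for L in (0.1, 1.0, 20.0):
        print(L, smeared_value(L))
```

For this particular amplitude the integral can be done in closed form and equals $\operatorname{erf}(2\pi L/\sqrt{2})$, which tends to 1 as $L \to \infty$; the numerical values may be compared with it.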

We have in fact, reversing the order of integration with respect to
$V$ and $C$,

$$\int |\psi|^2\,dV = \int a^*_{C'}\,dC' \int a_{C''}\,dC'' \int_V \psi^*_{C'}\psi_{C''}\,dV
= \int a^*_{C'}\,dC' \int a_{C''}\,dC''\;\delta_V(C'-C'').$$

Instead of first performing the integration with regard to $C'$ and $C''$
and then passing to the limit $V \to \infty$, we can in this case replace the
(perfectly definite) function $\delta_V(C'-C'')$ at once by the Dirac function
$\delta(C'-C'')$, which gives

$$\int |\psi|^2\,dV = \int a^*_{C'}a_{C'}\,dC'. \qquad (102a)$$

We thus see that the first integral converges along with the integral
$\int |a_C|^2\,dC$. The convergence of the latter can, however, always be
secured by a reasonable choice of the function $a_C$. The normalization
condition thus reduces to the equation

$$\int |a_C|^2\,dC = 1, \qquad (102b)$$

which replaces the equation $\sum_n |a_n|^2 = 1$ of the discrete case, and
shows that the product

$$|a_C|^2\,dC \qquad (102c)$$

can be considered as the probability that the particle is in a state of
motion specified by the interval $(C, C+dC)$.

† In some cases it is preferable to modify the definition of the wave functions $\psi$ so
as to make them vanish on a certain surface $S$ beyond which the forces can be assumed
to vanish. The problem is thus reduced to one characterized by a discrete spectrum.
Such quantities as possess a direct physical interest are usually only slightly affected by
the value of the volume $V$ enclosed by $S$, so long as it is sufficiently large. Their exact
values can be easily calculated by passing to the limit $V \to \infty$.

The expression (102c) is of the same form as the expression $|\psi|^2\,dV$
for the probability of a position specified by the volume element $dV$;
in both cases we have to deal with continuously variable parameters
($C$ or the coordinates $x$, $y$, $z$), and therefore in both cases it has a
meaning to talk of probability with reference not to a definite state or
position, but to a definite interval of states or positions, the probability
in question being proportional to the magnitude of the interval.
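A numerical illustration of (102b) and (102c) may be useful here. It is only a sketch under stated assumptions: the amplitude `a` is a hypothetical Gaussian centred at $C_0 = 2$ with width $\sigma = 0.5$ (nothing in the text prescribes this choice), and `integrate` is a plain midpoint rule written for the purpose.

```python
import math


def a(C, C0=2.0, sigma=0.5):
    """Hypothetical amplitude a_C, chosen so that the integral of |a_C|^2 dC is 1."""
    return (math.pi * sigma ** 2) ** -0.25 * math.exp(-((C - C0) ** 2) / (2.0 * sigma ** 2))


def integrate(f, lo, hi, n=20000):
    """Midpoint rule for the integral of f from lo to hi."""
    step = (hi - lo) / n
    return step * sum(f(lo + (j + 0.5) * step) for j in range(n))


# Normalization (102b): the integral of |a_C|^2 over all C.
norm = integrate(lambda C: abs(a(C)) ** 2, -3.0, 7.0)

# Probability (102c) summed over the finite interval 2.0 <= C <= 2.1.
prob = integrate(lambda C: abs(a(C)) ** 2, 2.0, 2.1)

if __name__ == "__main__":
    print(norm, prob)
```

For a sufficiently narrow interval the probability is nearly $|a_C|^2$ times the width of the interval, which is the proportionality referred to above.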

Subject to the condition (102b), the probable value of a quantity $F$
can be defined by the usual formula

$$\bar F = \int \psi^* F \psi\,dV,$$

which can be rewritten in the form

$$\bar F = \int a^*_{C'}\,dC' \int a_{C''}\,dC'' \int_V \psi^*_{C'}\,F\,\psi_{C''}\,dV,$$

i.e.

$$\bar F = \iint a^*_{C'}\,a_{C''}\,F_{C'C''}\,dC'\,dC''. \qquad (103a)$$

In the simplest case, when $F$ represents a constant of the motion, we
get, according to (101b),

$$\bar F = \int |a_C|^2\,F_C\,dC, \qquad (103b)$$

in agreement with the above interpretation of the product $|a_C|^2\,dC$.
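Formula (103b) is an ordinary weighted average and can be checked on a simple case. In the sketch below both the Gaussian amplitude `a` and the characteristic values $F_C = C^2$ are hypothetical choices made only for the illustration.

```python
import math


def a(C, C0=2.0, sigma=0.5):
    """Hypothetical normalized amplitude a_C (Gaussian; an assumption of this sketch)."""
    return (math.pi * sigma ** 2) ** -0.25 * math.exp(-((C - C0) ** 2) / (2.0 * sigma ** 2))


def F_value(C):
    """Hypothetical characteristic value F_C of a constant of the motion."""
    return C * C


def probable_value(lo=-3.0, hi=7.0, n=20000):
    """Midpoint-rule estimate of the integral of |a_C|^2 * F_C dC, as in (103b)."""
    step = (hi - lo) / n
    total = 0.0
    for j in range(n):
        C = lo + (j + 0.5) * step
        total += abs(a(C)) ** 2 * F_value(C)
    return total * step


if __name__ == "__main__":
    print(probable_value())
```

With these choices the integral can be evaluated exactly and gives $C_0^2 + \sigma^2/2$, with which the numerical result may be compared.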

If, however, $F$ is not a constant of the motion, the integral (103a)
representing its probable value cannot be evaluated directly and we
must have recourse to the method indicated above (first integration
over a finite volume, then over $C''$ or both $C''$ and $C'$, and finally
passage to the limit $V \to \infty$).

If the ‘$C$-space’ is subdivided into infinitely small intervals $\Delta C'$,
$\Delta C''$, etc., and a wave packet is built up for each interval, according
to the formula

$$\bar\psi_{C'} = \lim_{\Delta C' \to 0} \frac{1}{\sqrt{\Delta C'}} \int_{\Delta C'} \psi_C\,dC, \qquad (104)$$

we can replace the matrix components of $F$ with respect to the functions
$\psi_{C'}$ by matrix components with respect to the ‘quasi-discrete’
functions $\bar\psi_{C'}$ (normalized to unity):

$$\bar F_{C'C''} = \int \bar\psi^*_{C'}\,F\,\bar\psi_{C''}\,dV. \qquad (104a)$$

The connexion between these matrix components and those discussed
above is given by the formula

$$\bar F_{C'C''} = \lim \frac{1}{\sqrt{\Delta C'\,\Delta C''}} \int_{\Delta C'}\!\int_{\Delta C''} F_{C\tilde C}\,dC\,d\tilde C, \qquad (104b)$$

whence it follows that the probable value of $F$ can be written in the
form

$$\bar F = \lim \sum_{\Delta C'} \sum_{\Delta C''} \sqrt{\Delta C'\,\Delta C''}\;\bar F_{C'C''}\,a^*_{C'}\,a_{C''}. \qquad (104c)$$

The matrix components (or elements) of a real quantity with respect
to states of a continuous set must, of course, satisfy the Hermitian
relations

$$F_{C'C''} = F^*_{C''C'},$$

just as in the case of a discrete spectrum.

‘Continuous matrices’ cannot be conveniently represented by a square
array of elements or components, such as are used for discrete matrices.
This, however, does not invalidate the analytical results which have
been established in § 11; the only amendment which they require consists
in the replacement of the unit matrix $\delta_{mn}$ by the Dirac function
$\delta(C'-C'')$ and of summation with respect to discretely variable indices
by an integration with respect to the continuously variable indices
wherever the latter occur in the place of the former.

This has already been illustrated by the preceding examples. In a
similar way we get instead of (75)

$$F\psi_{C'} = \int F_{C''C'}\,\psi_{C''}\,dC'', \qquad (105)$$

and instead of (76)

$$(FG)_{C'C''} = \int F_{C'C}\,G_{CC''}\,dC \qquad (105a)$$

(multiplication law for continuous matrices).
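The multiplication law (105a) differs from the discrete one only in that a sum is replaced by an integral, so that it can be approximated by an ordinary sum over a fine subdivision of the continuous index. The sketch below assumes, purely for illustration, the regular kernels $F_{C'C} = G_{C'C} = e^{-(C'-C)^2}$, for which the integral is known in closed form; the names `F`, `G`, `product_element` and `exact_product` are invented for the example.

```python
import math


def F(c1, c2):
    """Hypothetical continuous matrix F with elements exp(-(c1 - c2)^2)."""
    return math.exp(-((c1 - c2) ** 2))


G = F  # the second factor is taken equal to the first for simplicity


def product_element(c1, c2, lo=-12.0, hi=12.0, n=4000):
    """Midpoint-rule estimate of the integral of F(c1, c) * G(c, c2) dc, cf. (105a)."""
    step = (hi - lo) / n
    total = 0.0
    for j in range(n):
        c = lo + (j + 0.5) * step
        total += F(c1, c) * G(c, c2)
    return total * step


def exact_product(c1, c2):
    """Closed form of the same integral for the Gaussian kernels assumed above."""
    return math.sqrt(math.pi / 2.0) * math.exp(-((c1 - c2) ** 2) / 2.0)


if __name__ == "__main__":
    print(product_element(0.3, -0.7), exact_product(0.3, -0.7))
```

The elements met with in practice are often singular (of the type of the Dirac function), in which case the passage to the limit must be made with the precautions described earlier in this section.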

The seemingly unimportant formal difference between the continuous
(or mixed) and discrete case is connected, however, with a fundamental
difference in the physical meaning both of the wave functions and of
the matrix elements. The essence of this difference consists in the fact
that, while to states belonging to a discrete set there corresponds in
classical mechanics periodic or quasi-periodic motion in a limited region
of space, states belonging to a continuous set correspond to aperiodic
motions of the classical theory, i.e. to types of motion for which the
kinetic energy remains positive at infinity and which approximate therefore
at infinite distance (so far as the forces vanish there) to free motion.

Motions of this type were not considered in the old quantum theory. 
The latter did not encroach upon the holy laws of classical mechanics, 
but merely added to them certain quantum restrictions when the motion 
was confined to a limited region of space and accordingly displayed 
certain periodicities corresponding to the many-valuedness of the action 
function S. As already shown above, Bohr’s quantum conditions 
amounted to the condition of single-valuedness for the function $e^{i2\pi S/h}$.

In the case of aperiodic motions, starting at infinity and ending at 
infinity, the action function S remains single-valued, so that quantum 
restrictions of any kind are unnecessary. 

The coordinates of a particle describing such an aperiodic motion,
considered as functions of the time $t$, cannot, of course, be expanded
in a Fourier series. The latter can be replaced, however, in this case
by a Fourier integral. Limiting ourselves, for the sake of simplicity,
to motion in one dimension, e.g. parallel to the $x$-axis, we can write
instead of (79), § 12,

$$x(t) = \int_{-\infty}^{+\infty} x^0(\nu)\,e^{i2\pi\nu t}\,d\nu, \qquad (106)$$

and instead of (81b)

$$x(t) = \int_{-\infty}^{+\infty} x^0_{\nu'\nu''}\,e^{i2\pi(\nu''-\nu')t}\,d\nu'', \qquad (106a)$$

where $x^0_{\nu'\nu''} = x^0(\nu''-\nu')$, the product $x^0_{\nu'\nu''}\,d\nu''$
replacing the amplitude $x_{mn}$; $\nu' = W'/h$ is the frequency associated
with the energy $W = W'$, which is supposed to be the energy of the motion
represented by (106a). As to the frequency $\nu'' = \nu'+\nu$, it is natural
to assume that it coincides approximately with $W''/h$, where $W''$ denotes
the energy of a state, a transition from which to the state $W'$ corresponds,
with regard to frequency and intensity of the emitted light, to the element
$x^0_{\nu'\nu''}\,d\nu''$ of the integral (106a). The question of the degree of
approximation between $\nu''$ and $W''/h$ (if $\nu' = W'/h$) has no definite
meaning in the present case with a continuously variable $W$, for equations
(80), (80a), (80b), and (80c) cannot be applied to it, the integrals $\oint$
referring to ‘round trips’ only. We are therefore entitled to assume that
$\nu''$ coincides exactly with $W''/h$, i.e. that there is not only a
‘correspondence’ but an actual identity between the classical frequencies
occurring in (106) and the quantum frequencies $(W''-W')/h$. The
responsibility for the disagreement between the classical and the quantum
theory can thus be shifted entirely on to the amplitude coefficients
$x^0_{\nu'\nu''}$, which can be supposed to ‘correspond’, i.e. to be
approximately equal, to the matrix elements of $x$ with regard to the states
$W'$ and $W''$,

$$\int_{-\infty}^{+\infty} x\,\psi^*_{W''}\,\psi_{W'}\,dx.$$

The correspondence with these elements can actually be established
with the help of the approximate expressions of the wave functions
in a way similar to that used in § 12 for the case of a discrete spectrum.
We shall put accordingly

$$\psi_{\nu'} = \frac{C_{\nu'}}{\sqrt{v_{\nu'}}}\,e^{i2\pi s^0_{\nu'}(x)/h}, \qquad (107)$$

where the coefficients $C_{\nu'}$ must be determined by the condition

$$\int_{-\infty}^{+\infty} \psi_{\nu'}\,\psi^*_{\nu''}\,dx = \delta(\nu'-\nu''). \qquad (107a)$$

Taking into account the relation

$$s^0_{\nu''}(x) - s^0_{\nu'}(x) \cong (W_{\nu''}-W_{\nu'})\,t, \qquad (107b)$$

which can easily be shown to hold approximately (for two states not
far removed from each other) irrespective of the periodic or aperiodic
character of the motion,† we get in the case of neighbouring values of
$\nu'$ and $\nu''$:

$$F_{\nu'\nu''} = \int_{-\infty}^{+\infty} F(x)\,\psi_{\nu'}\,\psi^*_{\nu''}\,dx \cong C_{\nu'}C_{\nu''} \int_{-\infty}^{+\infty} F(t)\,e^{-i2\pi(W_{\nu''}-W_{\nu'})t/h}\,dt. \qquad (108)$$

On the other hand, the Fourier coefficients in the integral representing
a function $F_{\nu'}(t)$,

$$F_{\nu'}(t) = \int_{-\infty}^{+\infty} F^0_{\nu'\nu''}\,e^{i2\pi(\nu''-\nu')t}\,d\nu'',$$

are determined by the formula

$$F^0_{\nu'\nu''} = \int_{-\infty}^{+\infty} F_{\nu'}(t)\,e^{-i2\pi(\nu''-\nu')t}\,dt, \qquad (108a)$$
which coincides with the preceding expression for $F_{\nu'\nu''}$ if we put
$\nu''-\nu' = (W''-W')/h$ and $C^2_{\nu'} = 1$. The latter condition can easily be
shown to follow from (107a). In fact the main contribution to the
integral (107a) must be due to distant points where the functions $s^0(x)$
reduce to $gx$ with a constant value of $g$ (corresponding to a constant
potential energy). Replacing $g$ by $hk$, where $k$ is the wave number, we
get, according to (23a), Chap. I,

$$\frac{C_{\nu'}C_{\nu''}}{\sqrt{v_{\nu'}v_{\nu''}}} \int_{-\infty}^{+\infty} e^{i2\pi(k'-k'')x}\,dx = \frac{C_{\nu'}C_{\nu''}}{\sqrt{v_{\nu'}v_{\nu''}}}\,\delta(k'-k'') = \delta(\nu'-\nu''),$$

whence

$$\int_{-\infty}^{+\infty} \frac{C_{\nu'}C_{\nu''}}{\sqrt{v_{\nu'}v_{\nu''}}}\,\delta(k'-k'')\,d\nu'' = 1,$$

or, since $\int \delta(k'-k'')\,dk'' = 1$,

$$\frac{C^2_{\nu'}}{v_{\nu'}}\,\frac{d\nu'}{dk'} = 1.$$

Taking into account the relation $\nu = hk^2/(2m)$, we get

$$d\nu/dk = hk/m = v$$

(group velocity = corpuscular velocity) and consequently

$$C^2_{\nu'} = 1.$$

† Cf. § 12. Since in the present case the integral $\oint \frac{\partial g}{\partial W}\,dx$ is non-existent, we can
put directly

$$s^0_{\nu''} - s^0_{\nu'} = \frac{\partial s^0}{\partial W}\,(W_{\nu''}-W_{\nu'}).$$

We have further, from the definition $\frac{\partial s^0}{\partial x} = g = \sqrt{2m(W-U)}$,

$$\frac{\partial s^0}{\partial W} = \int \frac{\partial g}{\partial W}\,dx = \int \frac{dx}{v} = t,$$

in the same way as before. The relation (107b) can be proved in a somewhat more
complicated manner for the general case of a (non-periodic) three-dimensional motion.

The integral (108) expressing the Fourier components of a function
$F_{\nu'}(t)$ converges and has a definite value only when this function
vanishes for $t = \pm\infty$. This condition is not satisfied for most of the
quantities referring to aperiodic motion. In the simplest case of uniform
motion we have, for instance, $x = vt$ and

$$x^0_{\nu'\nu''} = v_{\nu'} \int_{-\infty}^{+\infty} t\,e^{-i2\pi(\nu''-\nu')t}\,dt,$$

the integral obviously diverging. If, further, $F$ denotes a constant of the
motion, e.g. the energy $H$, we get

$$H^0_{\nu'\nu''} = W_{\nu'} \int_{-\infty}^{+\infty} e^{-i2\pi(\nu''-\nu')t}\,dt = W_{\nu'}\,\delta(\nu'-\nu''),$$

in exact agreement with the result (101b) obtained from the matrix
definition of $H_{\nu'\nu''}$.

These considerations give a new explanation of the fact, already 
mentioned, that the matrix elements of various quantities in the 
case of a continuous energy spectrum do not in general have definite 
values, being expressed by non-converging integrals over oscillatory 
functions of the $e^{i2\pi kx}$ type.



IV 

TRANSFORMATION THEORY 

15. Restricted Transformation Theory; Matrices defined from 
different ‘Points of View’ 

Let us consider two operators $H$ and $K$ which we shall assume to
represent the energy of the same particle moving in different fields of 
force with the potential-energy functions U(x,y,z) and V(x,y,z), both 
being independent of the time and limiting its movement classically to 
a finite region. 

The characteristic values of $H$, which in this case will form a discrete
set, will be denoted by $H'$ or $H''$, etc. (the dashed letters referring not
to a particular characteristic value, but to any one of them). The
corresponding characteristic functions will be denoted by

$$\psi_{H'} = \psi^0_{H'}(x,y,z)\,e^{-i2\pi H't/h},$$

etc. A similar notation will be used for the characteristic values $K'$
and functions $\phi_{K'} = \phi^0_{K'}(x,y,z)\,e^{-i2\pi K't/h}$ of the operator $K$.

If there is no degeneracy, the functions $\psi_{H'}$ will be completely
specified by the attached value of the operator to which they belong. In
case of degeneracy we must add to the energy operator one or two
other operators, representing independent constants of the motion, for
example the $z$-component of the angular momentum $M_z$ and its square
$M^2$ if the potential energy $U$ depends upon the distance $r$ alone (central
field of force). To avoid unnecessary complication, we shall in such
cases understand by $H$ the set of all these three mutually commutable
operators $H_1$, $H_2$, $H_3$, and by $H'$ a set of their characteristic values
$H_1'$, $H_2'$, $H_3'$ corresponding to the same function
$\psi_{H'} = \psi_{H_1'H_2'H_3'}$ (in the sense of the simultaneous validity of all
the three equations $H_1\psi_{H'} = H_1'\psi_{H'}$, $H_2\psi_{H'} = H_2'\psi_{H'}$,
$H_3\psi_{H'} = H_3'\psi_{H'}$, which we shall write as a single equation
$H\psi_{H'} = H'\psi_{H'}$). The same remark applies to the operator $K$, its
characteristic values $K'$, and its characteristic functions $\phi_{K'}$.

In addition, let us consider some quantity represented by an operator
$F$ and let us introduce its matrix representation with the help of the
functions $\psi_{H'}$ on the one hand and of the functions $\phi_{K'}$ on the other.
We shall thus get two different matrices which we shall denote by
$F_H$ and $F_K$ respectively and refer to as the matrix of $F$ ‘from the
point of view’ of $H$ and the matrix of $F$ from the point of view of $K$.
The components (or elements) of these matrices will be denoted by
$F_{H'H''}$ ($F^0_{H'H''}$) and $F_{K'K''}$ ($F^0_{K'K''}$). We shall thus have

$$F_{H'H''} = \int \psi^*_{H'}\,F\,\psi_{H''}\,dV, \qquad F^0_{H'H''} = \int \psi^{0*}_{H'}\,F\,\psi^0_{H''}\,dV,$$
$$F_{K'K''} = \int \phi^*_{K'}\,F\,\phi_{K''}\,dV, \qquad F^0_{K'K''} = \int \phi^{0*}_{K'}\,F\,\phi^0_{K''}\,dV, \qquad (109)$$

with

$$F_{H'H''} = F^0_{H'H''}\,e^{i2\pi(H'-H'')t/h}, \qquad F_{K'K''} = F^0_{K'K''}\,e^{i2\pi(K'-K'')t/h}. \qquad (109a)$$

In particular we shall have

$$H_{H'H''} = H'\,\delta_{H'H''}, \qquad K_{K'K''} = K'\,\delta_{K'K''}, \qquad (109b)$$

since $H$ and $K$ are diagonal matrices from their own point of view,
the elements of these matrices being identical with the respective
characteristic values.

The transformation theory in its simplest form consists in the
establishment of a certain connexion between the two ‘points of view’, i.e. of
certain relations between the functions $\psi_{H'}$ and the functions $\phi_{K'}$, as
well as between the matrices $F_H$ and $F_K$. With the help of equations
(109), the second part of this problem can be reduced to the first.
However, we shall see later that it can be solved independently without
the use of the functions $\psi$ and $\phi$, on the basis of the conditions (109b).

The fundamental assumption of the transformation theory is that
the amplitude functions $\phi^0_{K'}(x,y,z)$ can be expressed as linear
combinations of the amplitude functions $\psi^0_{H'}(x,y,z)$ according to the
equation

$$\phi^0_{K'} = \sum_{H'} a_{H'K'}\,\psi^0_{H'} \qquad (110)$$

with constant coefficients $a_{H'K'}$. We shall not try to justify this
assumption on formal grounds for the general case of any operators $H$ and $K$
but shall be content with the following remarks.

(a) The assumption (110) leads to an unambiguous determination
of the expansion coefficients $a_{H'K'}$. Indeed, multiplying (110) by
$\psi^{0*}_{H''}$ and supposing the different functions $\psi_{H'}$ to be orthogonal
to each other (which we can always do), we get upon integration

$$a_{H''K'} = \int \psi^{0*}_{H''}\,\phi^0_{K'}\,dV. \qquad (110a)$$

It is clear from this that equation (110) can hold only when the
summation is extended over all the values of $H'$, i.e. over all the stationary
states, defined by the operator $H$ (and those representing other
independent constants of the motion, if there is degeneracy).

(b) For our assumption to be justified it is necessary and sufficient
that the series (110) with the coefficients determined according to
(110a) should be convergent.

We shall argue in future as if this convergence condition were
satisfied. It can be shown to be actually satisfied in most cases of
practical importance corresponding to a small difference between $K$
and $H$ due to some weak ‘perturbing’ forces. In this particular case
the transformation theory we are developing reduces to the so-called
perturbation theory.

If the transformation (110) holds, then the reciprocal transformation

$$\psi^0_{H'} = \sum_{K'} a_{K'H'}\,\phi^0_{K'} \qquad (111)$$

must also hold with the coefficients

$$a_{K'H'} = \int \phi^{0*}_{K'}\,\psi^0_{H'}\,dV. \qquad (111a)$$

Comparing this with (110a), we get the relation

$$a^*_{H'K'} = a_{K'H'}. \qquad (112)$$

On substituting the expressions (111) in (110) or (110) in (111), we
get in the first case

$$\phi^0_{K'} = \sum_{H'} a_{H'K'} \sum_{K''} a_{K''H'}\,\phi^0_{K''} = \sum_{K''} \Big(\sum_{H'} a_{K''H'}\,a_{H'K'}\Big)\,\phi^0_{K''},$$

i.e.

$$\sum_{H'} a_{K''H'}\,a_{H'K'} = \delta_{K'K''}, \qquad (112a)$$

and in the second case

$$\sum_{K'} a_{H''K'}\,a_{K'H'} = \delta_{H'H''}. \qquad (112b)$$

Replacing $a_{K'H'}$ by $a^*_{H'K'}$ according to (112), we obtain the relations

$$\sum_{H'} a_{H'K'}\,a^*_{H'K''} = \delta_{K'K''}, \qquad (113)$$

$$\sum_{K'} a_{H'K'}\,a^*_{H''K'} = \delta_{H'H''}, \qquad (113a)$$

which express the orthogonality and normalization of the coefficients
$a_{H'K'}$ (or $a_{K'H'}$).

Another, equivalent, form of these relations is obtained by multiplying
(110) by its conjugate complex and summing over $K'$. This gives

$$\sum_{K'} |\phi^0_{K'}|^2 = \sum_{H'} \sum_{H''} \psi^0_{H'}\,\psi^{0*}_{H''} \sum_{K'} a_{H'K'}\,a^*_{H''K'},$$

i.e. according to (113a),

$$\sum_{K'} |\phi^0_{K'}|^2 = \sum_{H'} |\psi^0_{H'}|^2. \qquad (113b)$$

Before proceeding further in the formal development of the theory,
we shall examine the physical meaning of the assumption implied by
the transformation equations (110) and (111).

It should be noticed first of all that the latter have an external
resemblance to the representation of the general solution of the wave
equation $\left(H + \frac{h}{2\pi i}\frac{\partial}{\partial t}\right)\psi = 0$ in the form of a sum of its
particular solutions, i.e. to the equation

$$\psi = \sum_{H'} C_{H'}\,\psi_{H'} = \sum_{H'} C_{H'}\,\psi^0_{H'}\,e^{-i2\pi H't/h}. \qquad (113c)$$

The fundamental difference between the two cases is that the time $t$
enters as an essential factor in equation (113c), while the transformation
equations (110) or (111) do not contain it at all. If, however, we put
in (113c) $t = 0$ or $t = t_0$, i.e. consider the function $\psi$ at a definite
instant of time, we see that by a suitable choice of the amplitude coefficients
$C_{H'}$ it can be made to coincide with any one of the amplitude functions
$\phi^0_{K'}$, so far as the latter are actually expressible by a series of the type
(110). The physical meaning of the assumption implied in formula (110)
is that any stationary state defined by the operator $K$, according to
the equation $K\phi_{K'} = K'\phi_{K'}$, can be represented as a superposition of the
alternative states defined by the operator $H$ (according to
$H\psi_{H'} = H'\psi_{H'}$) at a certain instant of time. Such a coincidence, even if
achieved at a definite instant $t = t_0$, will, however, not persist unless the
coefficients $C_{H'}$ are allowed to vary with the time in an adequate manner.
In this case the function $\psi$ defined by (113c) will no longer represent a
general solution of the equation
$\left(H + \frac{h}{2\pi i}\frac{\partial}{\partial t}\right)\psi = 0$; it seems, however, natural
to suppose that, with a suitable definition of the functions $C_{H'}(t)$,
it will represent the general or a particular solution of the equation
$\left(K + \frac{h}{2\pi i}\frac{\partial}{\partial t}\right)\phi = 0$.

The latter assumption reduces to the equation

$$\phi_{K'} = \sum_{H'} C_{H'K'}(t)\,\psi_{H'}, \qquad (113d)$$

or

$$\phi^0_{K'}\,e^{-i2\pi K't/h} = \sum_{H'} C_{H'K'}\,\psi^0_{H'}\,e^{-i2\pi H't/h},$$

which becomes identical with (110) if we put

$$C_{H'K'} = a_{H'K'}\,e^{i2\pi(H'-K')t/h}. \qquad (113e)$$

In the same way we can replace the equations of the reciprocal
transformation (111) by

$$\psi_{H'} = \sum_{K'} C_{K'H'}(t)\,\phi_{K'} \qquad (114)$$

with

$$C_{K'H'} = a_{K'H'}\,e^{i2\pi(K'-H')t/h}. \qquad (114a)$$
We thus see that our fundamental assumption as to the existence of
a linear relation (110) or (111) between the amplitude functions $\phi^0_{K'}$ and
$\psi^0_{H'}$ is equivalent to the assumption that the same motion, whether it
be determined by an energy operator $H$ or $K$, can be described from
the point of view of the other operator, in the sense that a stationary
state of the set determined by $K$ (or $H$) can be represented as a
superposition of stationary states determined by $H$ (or $K$) with variable
amplitude coefficients $C_{H'K'}$ (or $C_{K'H'}$).

If the latter were constant, then (113c) would represent some general
solution of the equation $\left(H + \frac{h}{2\pi i}\frac{\partial}{\partial t}\right)\psi = 0$ corresponding
to the possibility of finding the particle in one of the alternative (mutually
excluding) states of motion defined by the different functions $\psi_{H'}$.
The coefficients $C_{H'K'}$, provided they satisfy the normalizing relation
$\sum_{H'} |C_{H'K'}|^2 = 1$, would in this case represent the ‘probability
amplitudes’ of the different alternative states $\psi_{H'}$, the probability of these
states being equal to the square of the moduli of $C_{H'K'}$.

It is natural to preserve this interpretation in the present case when
the $C_{H'K'}$ are functions of the time defined by (113e). This dependence
upon the time does not affect their moduli, which remain constant and
equal to the moduli of the transformation coefficients $a_{H'K'}$, the
normalization condition $\sum_{H'} |C_{H'K'}|^2 = 1$ being satisfied in virtue of the
relations (113) (with $K'' = K'$).
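That the time factor leaves the modulus untouched is easily verified numerically. The sketch below assumes units in which $h = 1$ and uses arbitrary illustrative values for $a_{H'K'}$, $H'$ and $K'$; the function name `time_dependent_coefficient` is invented for the example.

```python
import cmath
import math


def time_dependent_coefficient(a, H_value, K_value, t, h=1.0):
    """C_{H'K'}(t) = a_{H'K'} * exp(i*2*pi*(H' - K')*t/h), with h = 1 by assumption."""
    return a * cmath.exp(2j * math.pi * (H_value - K_value) * t / h)


if __name__ == "__main__":
    a = 0.6 + 0.3j                      # arbitrary illustrative coefficient
    for t in (0.0, 0.25, 1.7):
        C = time_dependent_coefficient(a, H_value=1.3, K_value=0.8, t=t)
        # The phase of C changes with t, but |C|^2 stays equal to |a|^2.
        print(t, C, abs(C) ** 2)
```

The probabilities $|C_{H'K'}|^2$ are thus the same at all times, only the phases of the amplitudes being affected.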

In defining the quantities $|C_{H'K'}|^2$ or $|a_{H'K'}|^2$ as the probabilities of
the different states of the $H$-set, we must not forget that all these states
are associated with a definite $K$-state, as indicated by the second
subscript in $a_{H'K'}$. The quantity $|a_{H'K'}|^2$ is not to be regarded as the
probability of the state $H'$ per se irrespective of any accessory
conditions (for such unconditioned probability has no definite value) but
as the probability of the state $H'$ subject to the accessory condition
that the particle is actually in a state of motion specified by the value $K'$
of $K$ or by the function $\phi_{K'}$.

Instead of talking of the states as described by the wave functions
$\phi_{K'}$ or $\psi_{H'}$, it is often more convenient to speak of the values of certain
quantities $F$, $H$, $K$ associated with these states. The fact that a definite
state is actually realized can be expressed by saying that the probability
of this state is equal to unity. We can thus say that $|a_{H'K'}|^2$ is the
probability that the quantity $H$ has the value $H'$ if it is known (with
a probability amounting to certainty, i.e. equal to unity) that the
quantity $K$ has the value $K'$.

It is perfectly natural that the determination of the probability of
a certain value of some quantity, e.g. $H$, must imply an assumption
about the probability of a given value of some other quantity $K$, for
the probability theory does not create probabilities, but only correlates
them.


From the relations (112), it follows that $|a_{H'K'}|^2 = |a_{K'H'}|^2$. This
equation can be interpreted from the probability point of view as the
expression of the ‘reciprocity law’, which means that the probability
of $H$ having the value $H'$ when $K$ is known to have the value $K'$ is
equal to the probability of $K$ having the value $K'$ when $H$ is known
to have the value $H'$.
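The relations (112) to (113a) and the reciprocity law can be checked on the smallest possible case. The sketch below is an illustration under assumptions not made in the text: the $H$-set consists of two states only, and $K$ is a real symmetric matrix written from the point of view of $H$, so that its characteristic vectors supply the coefficients $a_{H'K'}$. The function names are invented for the example.

```python
import math


def transformation_coefficients(k11, k12, k22):
    """Diagonalize K = [[k11, k12], [k12, k22]] given in the H-representation.

    Returns (values, a), where values[j] is the j-th characteristic value of K
    and a[i][j] plays the part of a_{H'K'} (row: H-state, column: K-state).
    """
    theta = 0.5 * math.atan2(2.0 * k12, k11 - k22)
    c, s = math.cos(theta), math.sin(theta)
    values = [k11 * c * c + 2.0 * k12 * c * s + k22 * s * s,
              k11 * s * s - 2.0 * k12 * c * s + k22 * c * c]
    return values, [[c, -s], [s, c]]


def inverse_2x2(m):
    """Inverse of a 2x2 matrix; applied to a it gives the reciprocal coefficients."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]


values, a = transformation_coefficients(2.0, 1.0, 3.0)
b = inverse_2x2(a)   # b[j][i] plays the part of a_{K'H'} in (111)

if __name__ == "__main__":
    for i in range(2):
        for j in range(2):
            # Reciprocity: |a_{H'K'}|^2 equals |a_{K'H'}|^2.
            print(i, j, a[i][j] ** 2, b[j][i] ** 2)
```

The coefficients of the reciprocal transformation are obtained here independently, by inverting the matrix, and are then found to agree with (112); for complex coefficients the comparison would have to be made with the conjugate values.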


This feature of the coefficients $a_{H'K'}$ reveals a close similarity between
them and the amplitude functions $\psi^0_{H'}$ (or $\phi^0_{K'}$). As a matter of fact,
the latter also depend upon two arguments, or sets of arguments: one
of them, $x$, $y$, $z$, specifying the position and the other, $H'$ (or $H_1'$,
$H_2'$, $H_3'$), the energy and some quantities commuting with it (i.e.
representing constants of the motion defined by the energy operator $H$).
Further, the function $|\psi^0_{H'}(x,y,z)|^2$, or more exactly its product with
the volume-element $dV$, does not determine the probability of a position
specified by $dV$ irrespective of any other circumstances, but subject to the
explicitly stated condition that $H$ is known to have the value $H'$. To
give an adequate formal expression to this analogy between the
coefficients $a_{H'K'}$ or $a_{K'H'}$ on the one hand, and the functions
$\psi^0_{H'}(x,y,z)$, $\phi^0_{K'}(x,y,z)$ on the other, we shall introduce for the
latter the following notation:

$$\psi^0_{H'}(x',y',z') = \psi_{x'H'}, \qquad \phi^0_{K'}(x',y',z') = \phi_{x'K'},$$

using $x'$ to represent a set of values of the three coordinates $x$, $y$, $z$ in
the same way as $H'$ or $K'$ is used to represent a set of values of the
three quantities $H_1$, $H_2$, $H_3$ or $K_1$, $K_2$, $K_3$.

The analogy between the functions $\psi_{x'H'}$, $\phi_{x'K'}$ and the coefficients
$a_{H'K'}$ or $a_{K'H'}$ seems to indicate that a set of values of the coordinates
$x$ ($x$, $y$, $z$) can specify a ‘state’ of the particle just as well as a set of
characteristic values of any other three mutually commuting operators $H_1$,
$H_2$, $H_3$ or $K_1$, $K_2$, $K_3$. We are thus led, in a very natural manner, to
revise the conception of a ‘state’ or ‘stationary state’ which we have been
using hitherto, in the sense that it is not determined by a function
$\psi_{x'H'}$ or $\phi_{x'K'}$, which refers to two states of two different sets like
the transformation coefficients (or probability amplitudes) $a_{H'K'}$ and
$a_{K'H'}$, but simply by the values of three quantities (corresponding to the
three degrees of freedom) which are represented by three independent mutually
commuting operators such as the three spatial coordinates of the particle,
or its energy, $z$-component of the angular momentum, and square of the
latter (in the case of a motion in a central field of force), and so on.
A ‘state’ defined in this more general way must no longer be necessarily
associated with the idea of motion. As a matter of fact the idea of
motion, in the sense of a change of the position with the time, has no
meaning in wave mechanics, being replaced by the idea of the probability
of finding the particle in a given position when its energy and
two other quantities commuting with the energy have given values.
The functions $\psi_{x'H'}$ do not have to be associated with motion any more
than the coefficients $a_{K'H'}$. They are to be interpreted simply as the
probability amplitudes for a state defined by the position $x'$ (or
volume-element $dV'$) subject to the condition that $H = H'$, just as the
coefficients $a_{K'H'}$ determine the probability of the value $K'$ of $K$ if $H$ is
known to have the value $H'$.

It should be remarked that in all these considerations the time does
not play any role whatever so long as it does not appear explicitly in
$H$ or in the other operators concerned.

We are thus driven by the inner logic of the ideas embodied in the
wave-mechanical theory to consider it as a special case of a general
physical theory (let us call it quantum mechanics) whose problem
consists in determining the probability of a certain value of some
quantity or of a set of quantities when a set of some other quantities
is assumed to have given values. This general problem reduces to
the usual wave-mechanical problem when the first three quantities
are the coordinates of the particle, and the second three are its energy
and some other two quantities which are represented by operators
commuting with the energy operator.

The condition that the three quantities of each set (those whose
values are supposed to be known or those for which the probability of
certain values is being determined) should be represented by mutually
commuting operators seems to be essential for the problem to have a
physical meaning. It is customary to express the possibility of fixing
simultaneously the value of two or more quantities by saying that they
can be simultaneously observed or measured; this can be regarded as
the experimental equivalent for the mathematical idea of ‘mutual
commutability’, connected with the operator or the matrix representation
of the quantity in question. I should like, however, to warn the reader
against the conclusion, often implied in the above expression, that in
discussing elementary phenomena, we must keep in mind the observer
or experimenter as an essential part of these phenomena, supposed to
be responsible through his interference with them for the indeterminateness
by which they are characterized, and which, as a matter of fact,
is only revealed and not produced by his observations.

This indeterminateness constitutes the characteristic feature of the
new quantum or wave mechanics, which distinguishes it from classical
mechanics. In the case of a particle moving in a given field of force
with three degrees of freedom, the classical mechanics assumed the
possibility of fixing simultaneously the values of six quantities, for
instance, the three coordinates $x$, $y$, $z$ and the three components of the
momentum $g_x$, $g_y$, $g_z$ (or the energy $H$, the $z$-component of the angular
momentum $M_z$, and the square of the latter $M^2$), whereby the motion
was completely determined, while the wave or quantum mechanics is
less ambitious and restricts the number of quantities whose values can
be fixed (arbitrarily, or by observation) to three, making up for the
resulting incompleteness or indeterminateness in the description of the motion
by probability considerations as to some other set of three quantities.

Another distinction between classical and quantum mechanics which 
must be borne in mind refers to the role played by the time. In the 
former case this role seems to be much more fundamental and important 
than in the second. As a matter of fact, the time seems to have been 
completely eliminated from the scope of the quantum mechanics as it 
has been specified above. This is, however, not quite true. First of all 
the time enters implicitly in the definition of such quantities as the 
components of velocity (or momentum) and various functions of them 
(such as energy, etc.), although these quantities are represented by 
operators which do not contain the time explicitly. And secondly we 
have supposed from the very beginning of this section that the potential 
energy of the field of force in which the particle is supposed to move 
does not contain the time explicitly, i.e. it depends upon the coordinates 
alone. It is only subject to this condition that the time can be practically 
eliminated from the theory; it becomes, however, a vital element of 
the latter when the potential energy is a function not only of the
coordinates but also of the time. In this case Schrödinger's equation

$$\left(H + \frac{h}{2\pi i}\frac{\partial}{\partial t}\right)\psi = 0$$

does not have particular solutions of the form

$$\psi = \psi^0(x, y, z)\, e^{-i 2\pi H' t/h},$$

with $\psi^0$ satisfying the equation $H\psi^0 = H'\psi^0$.

Characteristic values of the energy do not exist, or putting it in another 
way, values of the energy, if it is not a constant of the motion, cannot 
be measured, and the question of determining the probability of an 
arbitrarily chosen position $x'(x', y', z')$ for a given (supposedly known)
value of the energy becomes meaningless.

We shall now come back to our original assumption, that neither $H$ nor
$K$ contains the time explicitly and that they possess discrete sets of
characteristic values $H'(H'_1, H'_2, H'_3)$ and $K'(K'_1, K'_2, K'_3)$ which determine
two discrete sets of ‘states’. We have been led to the conclusion that the
coordinates of the particle can be used for the definition of a third set of
states, specified merely by the position of the particle in space. Since any
values of the coordinates $x'(x', y', z')$ are possible, these values can be
regarded as constituting a ‘continuous spectrum’. This distinction between
$H$ and $K$ on the one hand, and $x$ on the other hand is reflected in the fact
that in determining the probabilities we must speak of definite values of $H$
and $K$ and of a definite range of the values of $x$, i.e. of a volume-element
$dV'$ in which the particle is supposed to be situated. We thus have the
expressions: $|a_{H'K'}|^2$ for the probability of $H = H'$ if it is known that
$K = K'$, or of $K = K'$ if it is known that $H = H'$; $|\psi_{x'H'}|^2\,dV'$ for the
probability that $x$ is enclosed in the range $(x', x'+dx')$ if it is known
that $H = H'$ ($dV' = dx'\,dy'\,dz'$); $|\phi_{x'K'}|^2\,dV'$ for the probability that $x$ is
enclosed in the range $(x', x'+dx')$ if it is known that $K = K'$.

Generalizing the reciprocity law which has been established in the
case of $|a_{H'K'}|^2$, we can define $|\psi_{x'H'}|^2\,dV'$ and $|\phi_{x'K'}|^2\,dV'$ as the
probabilities of $H = H'$ or $K = K'$ when it is known that the particle is
located in the volume-element $dV'$.

The similarity between the functions $\psi_{x'H'}$ or $\phi_{x'K'}$, and the coefficients
$a_{K'H'}$ or $a_{H'K'}$ is revealed also by the fact that they satisfy similar
orthogonality and normalizing relations, which in the former case are
expressed either by means of integrals (over $x'$) instead of sums (over
$H'$ or $K'$) or by functions $\delta(x'-x'')$ instead of $\delta_{H'H''}$ or $\delta_{K'K''}$, corresponding
to the fact that $H'$ and $K'$ form a discrete and $x'$ a continuous
set of values. We have, namely, the relation (113 a), which can be
written in the form

$$\sum_{K'} a^*_{K'H'}\, a_{K'H''} = \delta_{H'H''},$$

and to which there correspond the usual orthogonality and normalizing
relations for the ‘wave function’ $\psi$,

$$\int \psi^*_{x'H'}\,\psi_{x'H''}\,dx' = \delta_{H'H''} \qquad (dx' = dV'). \tag{116}$$

Besides the preceding relation, the coefficients $a_{K'H'}$ also satisfy the
‘reciprocal’ relation (113) or

$$\sum_{H'} a_{K'H'}\, a^*_{K''H'} = \delta_{K'K''},$$

to which an analogue is found in the relation

$$\sum_{H'} \psi_{x'H'}\,\psi^*_{x''H'} = \delta(x'-x''), \tag{116a}$$




where $\delta(x'-x'')$ is an abbreviation for the product of the three Dirac
functions $\delta(x'-x'')\,\delta(y'-y'')\,\delta(z'-z'')$ (just as $\delta_{K'K''}$ is actually an
abbreviation for the product of the three expressions of this type for
the three quantities implied in $K$).

The proof of the relations (116 a) [i.e. of their equivalence to (116)]
is obtained by multiplying them by $\psi_{x''H''}$, where $H''$ is any fixed value
of $H$, and integrating over $x''$. This gives, in view of (116),

$$\int \sum_{H'} \psi_{x'H'}\,\psi^*_{x''H'}\,\psi_{x''H''}\,dx'' = \sum_{H'} \psi_{x'H'} \int \psi^*_{x''H'}\,\psi_{x''H''}\,dx'' = \psi_{x'H''},$$

which, according to the definition of the function $\delta(x''-x')$, agrees with
$\int \psi_{x''H''}\,\delta(x''-x')\,dx''$. The remaining difference between the probability
amplitudes $a_{H'K'}$ and $\psi_{x'H'}$ or $\phi_{x'K'}$ vanishes if we abandon our initial assumption
as to the discreteness of the spectrum of $H$ and $K$ and suppose
that one of these quantities, e.g. $H$, has a continuous spectrum, being
in this respect equivalent to $x$ (the spectrum of $K$ will be assumed for
a while to remain discrete).

The transformation equations (110) which, with our new notation,
could be written in the form

$$\phi_{x'K'} = \sum_{H'} \psi_{x'H'}\, a_{H'K'},$$

must now be replaced by†

$$\phi_{x'K'} = \int \psi_{x'H'}\, a_{H'K'}\,dH'. \tag{117}$$

Multiplying this equation by $\psi^*_{x'H''}$ and integrating over $x'(x', y', z')$
($dx' = dV'$), we get

$$\int \psi^*_{x'H''}\,\phi_{x'K'}\,dx' = \int a_{H'K'}\,dH' \int \psi^*_{x'H''}\,\psi_{x'H'}\,dx' = \int a_{H'K'}\,\delta(H'-H'')\,dH',$$

that is

$$a_{H''K'} = \int \psi^*_{x'H''}\,\phi_{x'K'}\,dx'$$

as before.‡ Since the form of the reciprocal transformation

$$\psi_{x'H'} = \sum_{K'} a_{K'H'}\,\phi_{x'K'} \tag{117a}$$

remains unchanged (so long as $K$ is supposed to have a discrete spectrum),
we get the previous relation between the coefficients of $a$ and $a^{-1}$
(the element $a_{K'H'}$, with the indices in the reverse order, denoting the
corresponding element of $a^{-1}$), namely, $a_{K'H'} = a^*_{H'K'}$, leading to the
reciprocity law $|a_{K'H'}|^2 = |a_{H'K'}|^2$.

† This transition is quite similar to a transition from a Fourier series to a Fourier
integral, which as a matter of fact forms a special case of the transformation or
‘expansion’ (117) and (117 a).

‡ It should be noticed that the former coefficient $a_{H'K'}$ actually corresponds to the
product of the present coefficient with $dH'$, this difference being compensated for by
the difference between the previous and the present form of the orthogonality and
normalizing relation for $\psi_{H'}$.

Substituting the preceding expression in (117 a), we get

$$\phi_{x'K'} = \sum_{K''} \phi_{x'K''} \int a_{K''H'}\, a_{H'K'}\,dH',$$

whence it follows that

$$\int a_{K''H'}\, a_{H'K'}\,dH' = \delta_{K'K''}$$

or

$$\int a^*_{H'K''}\, a_{H'K'}\,dH' = \delta_{K'K''}. \tag{118}$$

This orthogonality-normalizing relation, which replaces (113), is
identical with the corresponding relation for the function $\phi_{x'K'}$, $x'$ being
replaced by $H'$ [cf. (116)]. In a similar way (through substitution of
(117 a) in the reciprocal expansion) we find the relation

$$\sum_{K'} a_{H'K'}\, a^*_{H''K'} = \delta(H'-H''), \tag{118a}$$

which is the complete analogue of (116 a) with $x'$ replaced by $H'$ and
$H'$ by $K'$.

If both $H$ and $K$ have a continuous spectrum, the relations (118) and
(118 a), as well as (116) and (116 a), are replaced by relations of the form

$$\int a_{H'K'}\, a^*_{H'K''}\,dH' = \delta(K''-K'),$$

$$\int a^*_{H'K'}\, a_{H''K'}\,dK' = \delta(H''-H'),$$

$$\int \psi_{x'H'}\,\psi^*_{x'H''}\,dx' = \delta(H''-H'),$$

$$\int \psi_{x'H'}\,\psi^*_{x''H'}\,dH' = \delta(x''-x'),$$

etc., all the sums being replaced by integrals and all the $\delta_{K'K''}$-numbers
by $\delta(K'-K'')$-functions. All the transformation or expansion formulae
acquire in this case the same form (117 a).

From the complete analogy between $a_{H'K'}$ and $\psi_{x'H'}$ or $\phi_{x'K'}$ it follows
in particular that we must have, in addition to the equations

$$\phi_{x'K'} = \int \psi_{x'H'}\, a_{H'K'}\,dH', \qquad \psi_{x'H'} = \int \phi_{x'K'}\, a_{K'H'}\,dK', \tag{119}$$

the equation

$$a_{H'K'} = \int \psi_{H'x'}\,\phi_{x'K'}\,dx', \tag{119a}$$

where $\psi_{H'x'} = \psi^*_{x'H'}$. In fact, this equation is nothing else but the
expression (110 a) for the coefficients $a_{H'K'}$. We can thus consider this
equation as a ‘transformation’ between the functions $a_{H'K'}$ and $\phi_{x'K'}$, with
$\psi_{H'x'}$ playing the role of the transformation coefficients, or as a transformation
between the functions $a_{H'K'}$ and $\psi_{H'x'}$, the role of the
transformation coefficients being played in this case by $\phi_{x'K'}$.

It should be mentioned that (119 a) still holds when $H$ and $K$ have
discrete spectra, equation (110) being replaced by (117) and its reciprocal

$$\psi_{x'H'} = \sum_{K'} \phi_{x'K'}\, a_{K'H'}. \tag{119b}$$

After we have thus settled the physical meaning of the ‘transforma¬ 
tion coefficients’ or ‘wave functions’ as the probability amplitudes for 
the values of one of the quantities concerned when the value of the 
other is supposed to be fixed, we obtain an extremely simple and 
illuminating interpretation of the various ‘transformation equations’ 
connecting these probability amplitudes. All these equations can be 
considered, namely, as the expression or rather the direct consequence of 
the addition and multiplication law of the new probability theory (which 
deals with the probability amplitudes in the same way as the old theory 
dealt with the probabilities themselves). 

Taking the last equation, for example, we see that the product
$\phi_{x'K'}\,a_{K'H'}$ can be interpreted as the probability amplitude that $x$ will
be equal to $x'$ if $K = K'$ and that at the same time $K$ will have the
value $K'$ if $H$ is known to be equal to $H'$. Keeping the latter value
as well as that of $x$ fixed, and summing the products $\phi_{x'K'}\,a_{K'H'}$ for
all possible values of $K$, we must obviously obtain the probability
amplitude of $x = x'$ subject to the assumption that $H = H'$, in agreement
with (119b).
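
The addition and multiplication law for amplitudes can be tried out numerically. The sketch below is only an illustration under stated assumptions: the coordinate is replaced by a hypothetical two-point grid, and the names `phi` and `a_KH`, as well as their numerical values, are chosen for the example and do not come from the text. It forms the amplitude of (119b) by summing products of amplitudes over $K'$ and compares the resulting probabilities with those obtained by multiplying and adding the probabilities themselves.

```python
import math

def matmul(p, q):
    """Usual matrix product: rows of p combined with columns of q."""
    return [[sum(p[i][k] * q[k][j] for k in range(len(q)))
             for j in range(len(q[0]))] for i in range(len(p))]

s = 1 / math.sqrt(2)
# ASSUMPTION: x is restricted to a hypothetical two-point grid.
# phi[x][K]: amplitude of x = x' when K = K' (illustrative values).
phi = [[s, s], [s, -s]]
# a_KH[K][H]: amplitude of K = K' when H = H' (illustrative values).
a_KH = [[s, s], [s, -s]]

# Law (119b): psi[x][H] = sum over K of phi[x][K] * a_KH[K][H].
psi = matmul(phi, a_KH)
prob_from_amplitudes = [[abs(psi[x][h]) ** 2 for h in range(2)]
                        for x in range(2)]

# Rule of the old theory: multiply and add the probabilities themselves.
prob_from_probabilities = [
    [sum(abs(phi[x][k]) ** 2 * abs(a_KH[k][h]) ** 2 for k in range(2))
     for h in range(2)]
    for x in range(2)]

print(prob_from_amplitudes)
print(prob_from_probabilities)
```

With these illustrative values the amplitude rule gives probabilities 1 and 0 (up to rounding), whereas the rule applied to the probabilities gives 1/2 throughout; the difference is the interference between the two intermediate values of $K$.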


16. Transformation of Matrices 

We shall now return to the beginning of the preceding section, i.e. we 
shall again assume the values of H and K to be discrete, and we shall 
examine the transformation equations for the matrices representing 
different quantities F from the point of view of H and K. Before doing 
this we must point out the fact that the transformation coefficients
$a_{H'K'}$ and $a^{-1}_{K'H'}$ can also be considered as the matrix elements of a certain
matrix $a$ and its reciprocal $a^{-1}$ respectively, in the same way as
$F_{H'H''}$ or $F_{K'K''}$ are the matrix elements of $F_H$ or $F_K$. The main
difference between them is that, in the latter case, the two indices
($H'$, $H''$ or $K'$, $K''$) refer to states of the same set, defined either by $H$ or
by $K$, whereas in the former case the first index refers to a state of the
one set and the second to a state of the other set.

Another difference (closely related to the preceding one) is that while
the matrix elements $F_{K'K''}$ or $F_{H'H''}$ are Hermitian, i.e. satisfy the
conditions $F_{K'K''} = F^*_{K''K'}$, $F_{H'H''} = F^*_{H''H'}$, the coefficients (or matrix
elements) $a_{H'K'}$ are not Hermitian, as shown by the relations (112).

The matrix which is obtained from $F$ (or $a$) by interchanging the
rows and the columns is called the transposed matrix of $F$ and is
denoted (usually) by $\tilde{F}$. A matrix $\tilde{F}^*$ which is obtained from the
transposed $\tilde{F}$ by taking the conjugate complex of its elements is called,
according to Jordan, the ‘adjoint’ matrix of $F$ (‘conjugate imaginary’
according to Dirac) and denoted by $F^\dagger$. Using this notation, we can
write the Hermitian condition in the form

$$F^\dagger = F, \tag{120}$$

while the condition (112) can be written in the form

$$a^\dagger = a^{-1}. \tag{120a}$$

Matrices a satisfying this condition are called ‘unitary’, because the 
product of such a matrix with its adjoint matrix, which is the analogue 
of the square of the modulus of an ordinary complex number, is equal 
to unity (i.e. to the unit matrix). 
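
The conditions (120) and (120 a) are easy to check on small matrices. The following sketch is an illustration only: the helper names `dagger`, `matmul` and `close`, the 2x2 matrices `F` and `a`, and the parameters `theta` and `phase` are all assumptions made for the example. It builds the adjoint by transposing and conjugating, and then tests the Hermitian and the unitary condition.

```python
import cmath
import math

def dagger(m):
    """Adjoint: interchange rows and columns, then conjugate each element."""
    return [[complex(m[i][j]).conjugate() for i in range(len(m))]
            for j in range(len(m[0]))]

def matmul(p, q):
    """Usual matrix product: rows of p combined with columns of q."""
    return [[sum(p[i][k] * q[k][j] for k in range(len(q)))
             for j in range(len(q[0]))] for i in range(len(p))]

def close(p, q, tol=1e-12):
    """True if two matrices of equal shape agree element by element."""
    return all(abs(p[i][j] - q[i][j]) < tol
               for i in range(len(p)) for j in range(len(p[0])))

theta, phase = 0.3, 0.7          # arbitrary illustrative parameters
cos_t, sin_t = math.cos(theta), math.sin(theta)
a = [[cos_t, -sin_t * cmath.exp(1j * phase)],
     [sin_t * cmath.exp(-1j * phase), cos_t]]
F = [[1.0, 2 - 1j], [2 + 1j, -3.0]]
unit = [[1, 0], [0, 1]]

F_is_hermitian = close(dagger(F), F)                       # condition (120)
a_is_unitary = (close(matmul(a, dagger(a)), unit)
                and close(matmul(dagger(a), a), unit))     # condition (120a)
a_is_hermitian = close(dagger(a), a)
print(F_is_hermitian, a_is_unitary, a_is_hermitian)
```

The example matrix `a` is unitary without being Hermitian, in line with the remark that the transformation coefficients do not satisfy the Hermitian condition.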

It is self-evident that the multiplication of the matrices of the
type $a$ which do not correspond to a definite ‘point of view’ ($H$ or $K$)
but serve to connect two different points of view must be performed
according to the usual rule of matrix multiplication, i.e. by combining
the rows of the first factor with the columns of the second. This
means that the elements of the product of two matrices $a$ and $b$ must
have the form

$$(ab)_{mn} = \sum_k a_{mk}\, b_{kn},$$

i.e. that the second index of the elements of the first factor should
coincide with the first index of the elements of the second factor, this
common index being the index of summation.

From the point of view of this definition, the product of a ‘mixed’
matrix such as $a$ by itself or its conjugate complex $a^*$ would have
no meaning, since the two indices refer to states of different sets, and
therefore cannot be identified. We can, however, form the product of
$a$ with its transposed ($\tilde{a}$) or adjoint matrix ($a^\dagger$), since the first index
of the latter two refers to a state of the same set as the second index of
the former and vice versa. The expression $\sum_{K'} a_{H'K'}\,\tilde{a}_{K'H''}$ can thus be
considered as the $(H', H'')$ element of the product matrix $a\tilde{a}$ which is
of the same ‘pure’ type as the matrix $F_H$. The same refers to the
matrix $aa^\dagger$ or $aa^{-1}$, if the elements of the reciprocal matrix $a^{-1}$ are
labelled with the indices $H'$ and $K'$ in the order opposite to that which
refers to the matrix $a$ (as has actually been done in the preceding
section). It can easily be shown that the matrix $aa^\dagger$ is Hermitian (while
$a\tilde{a}$ is not). In fact, taking its adjoint matrix, which is obviously equal
to the product of the adjoint matrices of the two factors taken in the
reverse order, we get

$$(aa^\dagger)^\dagger = a^{\dagger\dagger}a^\dagger = aa^\dagger,$$

in agreement with (120).

It should be noticed that the two matrices $aa^\dagger$ and $a^\dagger a$ are, in general,
entirely different, the former belonging to the same type as $F_H$ and
the latter belonging to the same type as $F_K$.
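
A small numerical check may make these statements concrete. The sketch below uses a hypothetical non-unitary 2x2 matrix `a` (an assumption for the example, as are the helper names) and forms the three products discussed above: $aa^\dagger$, $a^\dagger a$ and $a\tilde{a}$.

```python
def dagger(m):
    """Adjoint: interchange rows and columns, then conjugate each element."""
    return [[complex(m[i][j]).conjugate() for i in range(len(m))]
            for j in range(len(m[0]))]

def transpose(m):
    """Transposed matrix: rows and columns interchanged, no conjugation."""
    return [[m[i][j] for i in range(len(m))] for j in range(len(m[0]))]

def matmul(p, q):
    """Usual matrix product: rows of p combined with columns of q."""
    return [[sum(p[i][k] * q[k][j] for k in range(len(q)))
             for j in range(len(q[0]))] for i in range(len(p))]

def close(p, q, tol=1e-12):
    """True if two matrices of equal shape agree element by element."""
    return all(abs(p[i][j] - q[i][j]) < tol
               for i in range(len(p)) for j in range(len(p[0])))

# Hypothetical 'mixed' matrix, deliberately not unitary.
a = [[1, 1j], [0, 2]]

aa_dagger = matmul(a, dagger(a))         # indices (H', H''), type of F_H
a_dagger_a = matmul(dagger(a), a)        # indices (K', K''), type of F_K
aa_transposed = matmul(a, transpose(a))  # product with the transposed matrix

print(close(dagger(aa_dagger), aa_dagger))          # Hermitian
print(close(dagger(a_dagger_a), a_dagger_a))        # Hermitian
print(close(dagger(aa_transposed), aa_transposed))  # not Hermitian here
print(close(aa_dagger, a_dagger_a))                 # the two products differ
```

For a unitary `a` both Hermitian products would reduce to the unit matrix; the non-unitary choice is made here only to show that they are different in general.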

In the particular case of a unitary matrix, satisfying the conditions
(120 a), we get

$$(a^\dagger a)_{K'K''} = \sum_{H'} a^\dagger_{K'H'}\, a_{H'K''} = \sum_{H'} a^*_{H'K'}\, a_{H'K''} = \delta_{K'K''},$$

$$(aa^\dagger)_{H'H''} = \sum_{K'} a_{H'K'}\, a^\dagger_{K'H''} = \sum_{K'} a_{H'K'}\, a^*_{H''K'} = \delta_{H'H''},$$

according to (112 a)-(113 a), or in matrix notation

$$aa^\dagger = \delta_H, \qquad a^\dagger a = \delta_K, \tag{120b}$$

where $\delta_H$ and $\delta_K$ denote the ‘unit matrix’ as defined from the ‘point
of view’ of $H$ or $K$. Neglecting the physical meaning implied in this
difference one often identifies the two unit matrices and writes

$$aa^\dagger = a^\dagger a = 1,$$

which occasionally can lead to misunderstandings.

The possibility of treating the transformation coefficients as the 
elements of a (mixed) matrix and of applying to the latter the usual 
rule of matrix multiplication is substantiated by the results obtained 
in two or more successive transformations. Let L be an operator (or 
set of three operators $L_1, L_2, L_3$) of the same kind as $H$ or $K$, with
the (discrete) characteristic values $L'$ and characteristic functions $\chi^0_{L'}$.
These functions can be ‘transformed’ to those of $K$ by means of the
equations $\chi^0_{L'} = \sum_{K'} b_{K'L'}\,\phi^0_{K'}$ and further to those of $H$ by means of
the equations $\phi^0_{K'} = \sum_{H'} a_{H'K'}\,\psi^0_{H'}$. Combining them together, we obtain
a direct transformation from $L$ to $H$,

$$\chi^0_{L'} = \sum_{H'} c_{H'L'}\,\psi^0_{H'},$$

with the coefficients $c_{H'L'} = \sum_{K'} a_{H'K'}\, b_{K'L'}$. The matrix of these coefficients
is thus equal to the product of the matrices $a$ and $b$ taken in the
order stated, and calculated according to the ordinary rule. Using the
matrix representation for the transformation coefficients, we can thus
define the matrix of two successive transformations as the product of
the matrices of each of the separate transformations. This holds, in
particular, for the case which has been considered above, where the
second transformation is the reciprocal of the first one.
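
The composition rule can be checked on a toy example. In the sketch below the ‘functions’ are replaced by columns of numbers on a hypothetical three-point grid; the array `psi`, the matrices `a` and `b`, and the angles `alpha` and `beta` are assumptions made for the illustration. Transforming in two steps (first with `a`, then with `b`) is compared with a single transformation by the product matrix `c`.

```python
import math

def matmul(p, q):
    """Usual matrix product: rows of p combined with columns of q."""
    return [[sum(p[i][k] * q[k][j] for k in range(len(q)))
             for j in range(len(q[0]))] for i in range(len(p))]

def close(p, q, tol=1e-12):
    """True if two matrices of equal shape agree element by element."""
    return all(abs(p[i][j] - q[i][j]) < tol
               for i in range(len(p)) for j in range(len(p[0])))

alpha, beta = 0.4, 1.1   # arbitrary illustrative angles
a = [[math.cos(alpha), -math.sin(alpha)],
     [math.sin(alpha), math.cos(alpha)]]              # a[H][K]
b = [[math.cos(beta), 1j * math.sin(beta)],
     [1j * math.sin(beta), math.cos(beta)]]           # b[K][L]

# psi[x][H]: two orthonormal columns on a hypothetical three-point grid.
psi = [[0.6, 0.0], [0.0, 1.0], [0.8, 0.0]]

phi = matmul(psi, a)               # functions of K from those of H
chi_two_steps = matmul(phi, b)     # functions of L from those of K
c = matmul(a, b)                   # c[H][L] = sum over K of a[H][K] * b[K][L]
chi_direct = matmul(psi, c)        # functions of L directly from those of H

print(close(chi_two_steps, chi_direct))
```

The agreement of the two results is just the associative law of matrix multiplication, which is what justifies treating the coefficients as elements of a matrix.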

We can now turn to the main object of this section: the transformation
of the matrix representing the same quantity $F$ in the transition
from one ‘point of view’ specified by $H$ to another, specified by $K$.
Substituting (110) in the expression (109) for the elements of $F_K$, we get

$$F_{K'K''} = \sum_{H'}\sum_{H''} a^*_{H'K'}\, a_{H''K''}\, F_{H'H''}, \tag{121}$$

which can be written in the form

$$F_{K'K''} = \sum_{H'}\sum_{H''} a^\dagger_{K'H'}\, F_{H'H''}\, a_{H''K''}.$$

This expression can be interpreted, according to the matrix multiplication
law, as the $(K', K'')$-element of the product of the matrices
$a^\dagger$, $F_H$, and $a$ taken in the order stated. We can thus put

$$F_K = a^\dagger F_H\, a. \tag{121a}$$

Substituting (111) in (100), we get in the same way

$$F_H = a F_K\, a^\dagger. \tag{121b}$$

This equation can be obtained from the preceding equation if the latter
is multiplied by $a$ on the left and by $a^\dagger$ on the right side and if the
relations $a^\dagger a = aa^\dagger = 1$ are taken into account.
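
The pair of formulae (121 a) and (121 b) can be verified numerically. The sketch below is a minimal illustration with assumed data: a hypothetical Hermitian 2x2 matrix `F_H` and a hypothetical unitary matrix `a`. It transforms `F_H` to `F_K`, transforms back, and checks that the Hermitian character is preserved.

```python
import cmath
import math

def dagger(m):
    """Adjoint: interchange rows and columns, then conjugate each element."""
    return [[complex(m[i][j]).conjugate() for i in range(len(m))]
            for j in range(len(m[0]))]

def matmul(p, q):
    """Usual matrix product: rows of p combined with columns of q."""
    return [[sum(p[i][k] * q[k][j] for k in range(len(q)))
             for j in range(len(q[0]))] for i in range(len(p))]

def close(p, q, tol=1e-12):
    """True if two matrices of equal shape agree element by element."""
    return all(abs(p[i][j] - q[i][j]) < tol
               for i in range(len(p)) for j in range(len(p[0])))

theta, phase = 0.3, 0.7          # arbitrary illustrative parameters
cos_t, sin_t = math.cos(theta), math.sin(theta)
a = [[cos_t, -sin_t * cmath.exp(1j * phase)],
     [sin_t * cmath.exp(-1j * phase), cos_t]]     # unitary, a[H][K]
F_H = [[1.0, 2 - 1j], [2 + 1j, -3.0]]             # Hermitian, hypothetical

F_K = matmul(matmul(dagger(a), F_H), a)           # equation (121a)
F_H_back = matmul(matmul(a, F_K), dagger(a))      # equation (121b)

print(close(dagger(F_K), F_K))    # F_K is again Hermitian
print(close(F_H_back, F_H))       # (121b) undoes (121a)
```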

If we restrict ourselves to multiplying (121a) by $a$ on the left or by
$a^\dagger$ on the right, we get

$$F_H\, a = a F_K \tag{121c}$$

and

$$F_K\, a^\dagger = a^\dagger F_H.$$

The product matrices in these equations all have a mixed character,
with elements of the type $(H', K')$ in the case of the first and $(K', H')$
in that of the second.

Written in matrix elements, these equations run

$$(F_H\, a)_{H'K'} = \sum_{H''} F_{H'H''}\, a_{H''K'} = \sum_{K''} a_{H'K''}\, F_{K''K'} = (aF_K)_{H'K'},$$

$$(F_K\, a^\dagger)_{K'H'} = \sum_{K''} F_{K'K''}\, a^\dagger_{K''H'} = \sum_{H''} a^\dagger_{K'H''}\, F_{H''H'} = (a^\dagger F_H)_{K'H'}.$$

If in (121 c) we put, in particular, $F = K$ or $F = H$, we get

$$K_H\, a = a K_K, \qquad a H_K = H_H\, a, \tag{122}$$

and two similar equations with $a^\dagger$ instead of $a$.

Taking the element $(H', K')$ of the first equation (122), we get, since
$K_{K'K''} = K'\delta_{K'K''}$,

$$\sum_{H''} K_{H'H''}\, a_{H''K'} = K' a_{H'K'}. \tag{122a}$$

In the same way we obtain from the second equation (122)

$$\sum_{K''} a_{H'K''}\, H_{K''K'} = a_{H'K'}\, H'. \tag{122b}$$

The equations (122 a) have exactly the same form for all values of
$K'$. Dropping $K'$ as second index in the coefficients $a$, we can rewrite
them as a single system of linear homogeneous equations (corresponding
to different values of $H'$) for a set of variables $a_{H'}$,

$$\sum_{H''} K_{H'H''}\, a_{H''} = K' a_{H'} \tag{123}$$

with a parameter $K'$.

This system of equations can serve for the direct determination both of
the transformation coefficients $a_{H'K'}$ and of the values $K'$ if the matrix
elements of $K_H$ are known. We have, indeed, as the condition of the
compatibility of equations (123) the vanishing of the determinant,

$$\begin{vmatrix} K_{H'H'}-K' & K_{H'H''} & K_{H'H'''} & \cdots \\ K_{H''H'} & K_{H''H''}-K' & K_{H''H'''} & \cdots \\ K_{H'''H'} & K_{H'''H''} & K_{H'''H'''}-K' & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{vmatrix} = 0, \tag{123a}$$

which is an equation for the determination of the possible values of
$K'$ ($K'$, $K''$, etc.). To each of these values there corresponds a set of
values of the variables $a_{H'}$ which we can identify, under certain conditions,
with the transformation coefficients $a_{H'K'}$ ($a_{H'K'}$, $a_{H''K'}$, etc.).
These conditions amount to the relations $a^\dagger a = aa^\dagger = 1$, which can be
shown to be verified if the solutions of (123) are normalized according
to the equation

$$\sum_{H'} a_{H'}\, a^*_{H'} = 1 \tag{123b}$$

for every value of $K'$.

Let us first of all make sure of the fact that the values $K'$ obtained
from (123 a) are real. To show this we take the equations

$$\sum_{H''} K_{H'H''}\, a_{H''K'} = K' a_{H'K'},$$

$$\sum_{H''} K^*_{H'H''}\, a^*_{H''K'} = K'^* a^*_{H'K'}$$

(the first of which can be considered as an identity, resulting from (123)
for a particular value of $K'$, and the second as its conjugate complex),
multiply them respectively by $a^*_{H'K'}$ and $a_{H'K'}$, sum over $H'$, and finally
subtract one from the other. This gives

$$(K'-K'^*)\sum_{H'} a_{H'K'}\, a^*_{H'K'} = \sum_{H'}\sum_{H''} K_{H'H''}\, a_{H''K'}\, a^*_{H'K'} - \sum_{H'}\sum_{H''} K^*_{H'H''}\, a^*_{H''K'}\, a_{H'K'}.$$

Taking into account the Hermitian condition $K^*_{H'H''} = K_{H''H'}$, we
can rewrite the second double sum on the right side in the form
$\sum_{H'}\sum_{H''} K_{H''H'}\, a^*_{H''K'}\, a_{H'K'}$, which becomes identical with the first double
sum if we interchange the summation indices $H'$ and $H''$. We thus get

$$(K'-K'^*)\sum_{H'} a_{H'K'}\, a^*_{H'K'} = 0,$$

or, since the sum $\sum_{H'} a_{H'K'}\, a^*_{H'K'} = \sum_{H'} |a_{H'K'}|^2$ is essentially positive,

$$K'-K'^* = 0.$$

This equation expresses the fact that $K'$ is real.

If, in the preceding argument, we replace the second equation by an
equation (identity)

$$\sum_{H''} K^*_{H'H''}\, a^*_{H''K''} = K'' a^*_{H'K''}$$

corresponding to some value of $K''$ different from $K'$, multiply it by
$a_{H'K'}$, sum over $H'$, and subtract from the first equation multiplied
by $a^*_{H'K''}$ and also summed over $H'$, we get

$$(K'-K'')\sum_{H'} a_{H'K'}\, a^*_{H'K''} = \sum_{H'}\sum_{H''} K_{H'H''}\, a_{H''K'}\, a^*_{H'K''} - \sum_{H'}\sum_{H''} K^*_{H'H''}\, a^*_{H''K''}\, a_{H'K'}.$$

In view of $K^*_{H'H''} = K_{H''H'}$ and the interchangeability of the summation
indices $H'$, $H''$, the right side vanishes just as in the case $K' = K''$,
and we get

$$(K'-K'')\sum_{H'} a_{H'K'}\, a^*_{H'K''} = 0,$$

which, since $K'-K''$ is assumed to be different from zero, reduces to

$$\sum_{H'} a_{H'K'}\, a^*_{H'K''} = 0$$

or

$$\sum_{H'} a^\dagger_{K''H'}\, a_{H'K'} = 0.$$

This relation expresses the mutual ‘orthogonality’ of the different sets
of solutions of the system of equations (123). Together with the
normalizing condition (123 b), it can be written in the form

$$a^\dagger a = \delta_K,$$

whereby the identity of the coefficients $a_{H'K'}$ obtained from equations
(123), (123 a), and (123 b), with those defined at the outset with the help
of the wave functions $\psi_{H'}$ and $\phi_{K'}$ by means of equations (110 a) and
(111 a), is demonstrated.
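
The whole procedure of (123), (123 a) and (123 b) can be followed by hand for two states. The sketch below is a worked illustration under stated assumptions: the Hermitian matrix `K_H` is a hypothetical example with a non-vanishing off-diagonal element, and the function and variable names are chosen for the illustration. It solves the secular equation as a quadratic, normalizes the two solutions of (123), and checks their orthogonality and the diagonal form of $a^\dagger K_H a$.

```python
import math

def dagger(m):
    """Adjoint: interchange rows and columns, then conjugate each element."""
    return [[complex(m[i][j]).conjugate() for i in range(len(m))]
            for j in range(len(m[0]))]

def matmul(p, q):
    """Usual matrix product: rows of p combined with columns of q."""
    return [[sum(p[i][k] * q[k][j] for k in range(len(q)))
             for j in range(len(q[0]))] for i in range(len(p))]

# Hypothetical Hermitian matrix of K 'from the point of view of H'.
K_H = [[1, 1 - 1j], [1 + 1j, 2]]

# Secular equation (123a) for two rows:
# (K11 - K')(K22 - K') - |K12|^2 = 0, a quadratic with real roots.
mean = (K_H[0][0] + K_H[1][1]) / 2
half_gap = (K_H[0][0] - K_H[1][1]) / 2
root = math.sqrt(half_gap ** 2 + abs(K_H[0][1]) ** 2)
eigenvalues = [mean - root, mean + root]

def solution(k_value):
    """Solution of (123) for one value K'; valid here because K12 != 0."""
    v = [K_H[0][1], k_value - K_H[0][0]]
    norm = math.sqrt(sum(abs(x) ** 2 for x in v))   # normalization (123b)
    return [x / norm for x in v]

columns = [solution(k) for k in eigenvalues]
a = [[columns[k][h] for k in range(2)] for h in range(2)]   # a[H][K]

a_dagger_a = matmul(dagger(a), a)            # should be the unit matrix
K_K = matmul(matmul(dagger(a), K_H), a)      # should be diagonal
print(eigenvalues)
print(K_K)
```

For this example the two characteristic values are 0 and 3 (up to rounding), and the off-diagonal elements of `K_K` vanish to within rounding error.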

At the same time we have demonstrated the possibility of effecting
the transformation of the matrix $F_H$ representing an arbitrary physical
quantity $F$ ‘from the point of view of $H$’ (i.e. with regard to states
defined by $H$) to the matrix $F_K$ representing the same quantity
‘from the point of view of $K$’ without the use of the wave functions
characteristic of $H$ and $K$, but by a purely matrix method, based upon
the matrix representation of all quantities (including the key one, $K$)
‘from the point of view of $H$’. The transition from this point of view
to that of $K$ can be effected by means of the equations (123), (123 a),
(123 b), which determine the transformation matrix $a$, and further by
means of equation (121a), giving the new matrix elements of any
quantity $F$ in terms of the old matrix elements.

In view of the relation $a^\dagger = a^{-1}$, this formula can also be written in
the form

$$F_K = a^{-1} F_H\, a. \tag{124}$$

The transformation matrix $a$ can actually be defined by the condition

$$a^{-1} K_H\, a = K_K \quad \text{(a diagonal matrix)} \tag{124a}$$

which leads, after a left-handed multiplication by $a$, to the equation
$K_H\, a = a K_K$, i.e. to the system of equations (123); the unitary character
of the matrix $a$, expressed by the relation $a^\dagger a = 1$, can be considered
as a consequence of these equations.

A transformation of the type (124) is generally called a canonical matrix
transformation. It has an interesting feature which does not depend
upon $a$ being a unitary matrix (i.e. satisfying the relation $a^\dagger = a^{-1}$),
namely, of leaving invariant all the functional relations between the
original matrices, the same functional relations holding between the
transformed matrices. This can be proved directly by putting in (124)
$F = E+G$ or $F = EG$. In the first case we get, since $F_H = E_H+G_H$,

$$F_K = a^{-1}(E_H+G_H)\,a = a^{-1}E_H\, a + a^{-1}G_H\, a = E_K + G_K;$$

in the second case we have, using $(EG)_H = E_H G_H$,

$$F_K = a^{-1} E_H G_H\, a.$$

Now we can insert between $E_H$ and $G_H$ the product $aa^{-1}$, since it is
equal to the unit matrix $\delta$ whose product with any other matrix is
identical with the latter (just as in the case of the multiplication of
ordinary numbers by an ordinary unity). We thus get, by the associative
law,

$$F_K = (a^{-1}E_H\, a)(a^{-1}G_H\, a) = E_K G_K.$$

This proof can easily be extended by induction to any function $F$ of
$E$ and $G$, so that, putting (in the operator representation) $F = f(E, G)$,
we have

$$f(E_K, G_K) = a^{-1} f(E_H, G_H)\, a \tag{124b}$$

or

$$a^{-1} f(E_H, G_H)\, a = f(a^{-1}E_H\, a,\; a^{-1}G_H\, a).$$

It follows from these equations that, in particular, the transformation
(124) does not affect the validity of the commutation relations between
the coordinates and the components of the momentum; the original relations
$(p_x x - x p_x)_H = \frac{h}{2\pi i}\,\delta_H$ are transformed into $(p_x x - x p_x)_K = \frac{h}{2\pi i}\,\delta_K$.
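
The invariance of sums, products and commutators holds for any invertible matrix, unitary or not, and this can be checked directly. The sketch below uses hypothetical 2x2 matrices `E`, `G` and a deliberately non-unitary `a` (all assumptions for the example). Finite matrices cannot satisfy the commutation relation $p_x x - x p_x = h/2\pi i$ exactly, since the trace of a commutator vanishes, so the example only verifies that an arbitrary commutator is carried over unchanged in form.

```python
def matmul(p, q):
    """Usual matrix product: rows of p combined with columns of q."""
    return [[sum(p[i][k] * q[k][j] for k in range(len(q)))
             for j in range(len(q[0]))] for i in range(len(p))]

def add(p, q):
    return [[p[i][j] + q[i][j] for j in range(len(p[0]))] for i in range(len(p))]

def sub(p, q):
    return [[p[i][j] - q[i][j] for j in range(len(p[0]))] for i in range(len(p))]

def inverse2(m):
    """Reciprocal of a 2x2 matrix with non-vanishing determinant."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

def close(p, q, tol=1e-12):
    return all(abs(p[i][j] - q[i][j]) < tol
               for i in range(len(p)) for j in range(len(p[0])))

a = [[2, 1], [1, 1]]            # invertible but not unitary (hypothetical)
a_inv = inverse2(a)

def transform(m):
    """Canonical matrix transformation (124): a^{-1} m a."""
    return matmul(matmul(a_inv, m), a)

E = [[1, 2], [3, 4]]            # hypothetical matrices
G = [[0, 1], [1j, 2]]

sum_ok = close(transform(add(E, G)), add(transform(E), transform(G)))
product_ok = close(transform(matmul(E, G)),
                   matmul(transform(E), transform(G)))
commutator = sub(matmul(E, G), matmul(G, E))
commutator_ok = close(
    transform(commutator),
    sub(matmul(transform(E), transform(G)),
        matmul(transform(G), transform(E))))
print(sum_ok, product_ok, commutator_ok)
```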

Canonical transformations of the above type should be distinguished 
from canonical transformations of the variables x, y, z, p x , p y , p z in the 
sense corresponding to the general definition of a canonical transforma¬ 
tion in classical mechanics (see § 5). In the former case the canonically 
conjugate variables are supposed to remain unaltered, the transforma¬ 
tion referring to the matrices only by which they are represented from 
the point of view of different energy operators (H or K). In the latter 
case, on the contrary, the variables $x, \ldots, p_z$ are themselves transformed
into a new set of canonically conjugate variables $\xi, \eta, \zeta, \pi_\xi, \pi_\eta, \pi_\zeta$, the
energy operator $H^{(x)} = H(x, \ldots, p_z)$ remaining essentially the same and only
changing its external form because the old variables defining it are
replaced by their expressions in terms of the new variables. We thus
get for it a new function, say $H^{(\xi)}$, of the variables $\xi, \ldots, \pi_\zeta$ which is,
however, numerically equal to $H^{(x)}$ for the corresponding values of the
original variables. This numerical equality of the classical theory is
replaced in quantum mechanics by the equality of the characteristic
values of the operators $H^{(x)}$ and $H^{(\xi)}$. The condition expressing the
canonical character of the transformation from the original variables
to the new ones consists in the fact that the matrices representing the
latter (from any point of view) should satisfy the same commutation
relations $\pi_\xi \xi - \xi \pi_\xi = h\delta/2\pi i$, etc., as those representing the old variables.
This means that the new matrices (of $\xi, \ldots, \pi_\zeta$) can be derived
from the old ones (of $x, \ldots, p_z$) by a canonical transformation in the first
sense, i.e. in the sense of the equation (124). The physical meaning of
such a transformation will, however, be entirely different from the case
to which (124) refers, the two kinds of transformation bearing but a
formal resemblance to each other. We shall come back to the transformations
of the second kind in the next section.

In the case of a degeneracy of the original energy matrix $H_H$, i.e.
when some of its diagonal elements coincide, it is necessary to consider
it simultaneously with one or two other matrices, which represent independent
constants of the motion specified by $H$. We must therefore
replace the operator $H$ by the three operators $H_1, H_2, H_3$ and define the
matrix representation of any quantity $F$ from the ‘point of view’ of this
‘trio’, writing $F_{(H_1 H_2 H_3)}$ instead of $F_H$. The transformation matrix corresponding
to a transition to the ‘point of view’ of some other trio, e.g.
$K_1, K_2, K_3$, will then be unambiguously determined by the simultaneous
equations

$$\left.\begin{aligned} a^{-1} K_{1(H_1H_2H_3)}\, a &= K_{1(K_1K_2K_3)} \\ a^{-1} K_{2(H_1H_2H_3)}\, a &= K_{2(K_1K_2K_3)} \\ a^{-1} K_{3(H_1H_2H_3)}\, a &= K_{3(K_1K_2K_3)} \end{aligned}\right\} \tag{124c}$$

with the condition that all the three matrices on the right side should
be diagonal (which can always be satisfied if the corresponding operators
$K_1, K_2, K_3$ commute with each other). Each of the equations (124c),
taken separately, will leave a certain amount of ambiguity in the shape
of the matrix $a$, which can be removed by means of one or both of
the others; if we do not desire a diagonal representation of the corresponding
quantities we can remove this ambiguity in a perfectly
arbitrary manner (consistent with the condition $a^{-1} = a^\dagger$).
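
The way in which a second commuting matrix removes the ambiguity left by a degenerate one can be seen in a three-state toy example. The matrices `K1` and `K2` below are hypothetical (two quantities only, for brevity), as are the helper names; `K1` has a doubly repeated diagonal value, so that it is already diagonal for more than one choice of the transformation matrix, while `K2` singles out one of them.

```python
import math

def dagger(m):
    """Adjoint: interchange rows and columns, then conjugate each element."""
    return [[complex(m[i][j]).conjugate() for i in range(len(m))]
            for j in range(len(m[0]))]

def matmul(p, q):
    """Usual matrix product: rows of p combined with columns of q."""
    return [[sum(p[i][k] * q[k][j] for k in range(len(q)))
             for j in range(len(q[0]))] for i in range(len(p))]

def transform(m, a):
    """a^{-1} m a for a unitary matrix a (a^{-1} equal to the adjoint)."""
    return matmul(matmul(dagger(a), m), a)

def is_diagonal(m, tol=1e-12):
    return all(abs(m[i][j]) < tol
               for i in range(len(m)) for j in range(len(m)) if i != j)

# Hypothetical commuting matrices; K1 is degenerate (value 2 occurs twice).
K1 = [[2, 0, 0], [0, 2, 0], [0, 0, 5]]
K2 = [[0, 1, 0], [1, 0, 0], [0, 0, 3]]
commute = matmul(K1, K2) == matmul(K2, K1)

s = 1 / math.sqrt(2)
a_trivial = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]   # diagonalizes K1 only
a = [[s, s, 0], [s, -s, 0], [0, 0, 1]]          # diagonalizes K1 and K2

print(commute)
print(is_diagonal(transform(K1, a_trivial)), is_diagonal(transform(K2, a_trivial)))
print(is_diagonal(transform(K1, a)), is_diagonal(transform(K2, a)))
```

Both `a_trivial` and `a` satisfy the first condition; only `a` satisfies the second as well, which is the sense in which the simultaneous equations fix the transformation.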

The preceding considerations can easily be generalized for the case 
when either or both of the operators (or the operator trios) H and K 
have a continuous spectrum. Let us assume, for instance, that the 
values of H form a continuous set, while those of K remain discrete. 
We then have, instead of (110) and (111), the transformation equations
(117) and (117 a) with a semi-continuous transformation matrix
$a_{H'K'}$ satisfying the orthogonality and normalizing relations (118) and
(118 a). The latter can be put in the same form,

$$aa^\dagger = \delta_H, \qquad a^\dagger a = \delta_K,$$

as in the discrete case, if $\delta_H$ is considered as a continuous unit matrix,
i.e. as a Dirac function

$$\delta_{H'H''} = \delta(H'-H''),$$

while $\delta_K$ is the usual discrete unit matrix, and if, further, the matrix
multiplication law is defined in the usual way corresponding to discrete
matrices in the case of $aa^\dagger$:

$$(aa^\dagger)_{H'H''} = \sum_{K'} a_{H'K'}\, a^\dagger_{K'H''},$$

and in the way corresponding to continuous matrices in the case of $a^\dagger a$:

$$(a^\dagger a)_{K'K''} = \int a^\dagger_{K'H'}\, a_{H'K''}\,dH'$$

[cf. eq. (105 a), § 14].

We get further, instead of (121),

$$F^0_{K'K''} = \iint a^*_{H'K'}\, a_{H''K''}\, F^0_{H'H''}\,dH'\,dH'',$$

or

$$F^0_{K'K''} = \iint a^\dagger_{K'H'}\, F^0_{H'H''}\, a_{H''K''}\,dH'\,dH'',$$

which, as in the discrete case, can be written in the matrix form

$$F_K = a^\dagger F_H\, a,$$

it being understood that the matrix multiplication must be carried out
according to the rule for continuous matrices whenever the ‘summation’
indices are continuously variable. From this equation we can
derive the equations (122), the second of which, when reduced to matrix
elements, runs exactly as before [eq. (122 b)], while the first assumes
the form

$$\int K^0_{H'H''}\, a_{H''K'}\,dH'' = K' a_{H'K'},$$

instead of (122 a). Dropping the index $K'$ of the coefficients $a_{H'K'}$, we get

$$\int K^0_{H'H''}\, a_{H''}\,dH'' = K' a_{H'}, \tag{125}$$

which can be considered as an integral equation for the determination
of the functions $a_{H'}$ and the characteristic values $K'$, replacing the
system of algebraic equations (123). The result of the elimination of
the functions $a_{H'}$ from (125) cannot be written in the form of a determinant
(123 a) unless we adopt a generalized definition of ‘continuous
determinants’ corresponding to continuous matrices. Writing the right
side of (125) in the form $\int K' a_{H''}\,\delta(H''-H')\,dH''$, we could then replace
the compatibility equation (123 a), which serves for the determination
of the characteristic values of $K$ ($K' = K_{K'K'}$), by a symbolic equation
of the type

$$|K_{H'H''} - K'\,\delta(H'-H'')| = 0, \tag{125a}$$

indicating the general element of the determinant. In the corresponding
notation for the discrete case, equation (123a) would run as follows:

$$|K_{H'H''} - K'\,\delta_{H'H''}| = 0.$$

Of course (125 a) cannot be used for the actual calculation of the values
$K'$, but this is also true of equation (123 a), since it refers to a determinant
which consists of an infinite number of discrete elements.

We shall indicate later the method which can be used for the approximate
calculation of the admissible values of $K'$ when $K$ differs but
little from $H$ (as is the case in problems of the perturbation theory).
It should be remarked here that both for a discrete and a continuous
spectrum of $H$ the characteristic values of $K$ may form a discrete as well
as a continuous spectrum (contrary to the assumption which was made
at the beginning about the discreteness of the $K$-spectrum).

It can easily be proved that if the functions $a_{H'}$ [‘characteristic
functions’ of the integral equation (125)] corresponding to a particular
value $K'$ are labelled with this value as second index, they will form
an orthogonal set (discrete or continuous, together with the set of
values of $K'$) and normalizable to unity, i.e. satisfying the relations

$$\int a_{H'K'}\, a^*_{H'K''}\,dH' = \delta_{K'K''} \ \text{ or } \ \delta(K'-K'')$$

and

$$\sum_{K'} a_{H'K'}\, a^*_{H''K'} \ \text{ or } \ \int a_{H'K'}\, a^*_{H''K'}\,dK' = \delta(H'-H'')$$

as the case may be.

The proof is obtained in exactly the same way as in the case of a
discrete $H'$-spectrum dealt with above and therefore will not be reproduced
here. It should be remarked incidentally that the results referring
to the latter case must be amended to allow for the possibility of $K$
having a continuous spectrum with $K_{K'K''} = K'\,\delta(K'-K'')$.

Summing up, we can say that both with a discrete and a continuous
spectrum of the ‘basic quantity’ (or basic trio) $H$, it is possible to
calculate the matrix elements of any quantity $F$ from the point of view
of some other ‘basic quantity’ (or basic trio) $K$, without the knowledge
of the characteristic functions of either $H$ or $K$; the only thing which
it is necessary to know in order to carry out the transformation from
$F_H$ to $F_K$ is the matrix $K_H$. The transformation coefficients $a_{H'K'}$ can
be found from the condition that $K_K$ is a diagonal matrix of the discrete
or of the continuous type (which need not and cannot be specified
beforehand).

17. Transformation Theory of Matrices as a Generalization of Wave Mechanics; Transformation of Basic Quantities

It thus appears that the matrix theory, so far as the transformation 
from one point of view to another is concerned, can be considered as 
a logically closed self-supporting structure, which does not need the 
wave-mechanical basis upon which we have built it up. We have 
already met with a similar situation in the preceding chapter, when we 
were discussing the question of the actual determination of the matrices 
corresponding to a given energy operator and found it possible to 
achieve this result by determining the fundamental Hermitian matrices 
of the coordinates and the momentum-components in such a way as to 
make the energy matrix diagonal subject to the commutation conditions
$p_x x - x p_x = h/2\pi i$, etc.

In the light of the transformation theory developed in this chapter,
it appears, first of all, that if the latter problem has been solved for
some simple type of motion specified by the energy operator $H$, it can
be solved for any other type of motion, specified by some more
complicated energy operator $K$, by the method of the transformation
theory, without getting back to fundamental matrices $(x, p_x)$ and
commutation conditions (which, as has been shown above, are invariant
with respect to canonical transformations). It is just this method of
solution which is used by the perturbation theory, when the difference
between the operators $K$ and $H$ is sufficiently small.

Besides furnishing a simple and practically the only workable method
for the solution of such perturbation problems, the transformation
theory reveals a new connexion between the matrix and the
wave-mechanical method, reducing the latter to a particular case of the
former, as was pointed out in the preceding section. We have seen,
namely, that the characteristic functions or probability amplitudes of
the wave-mechanical theory $\psi_{x'H'}$ can be considered as the
transformation coefficients from the point of view of the ‘energy-trio’
$H$ to that of the ‘coordinate-trio’ $x$ (provided that such a thing as
the energy exists, i.e. that the energy operator $H$ does not contain
the time), in the same sense as the probability amplitudes $a_{H'K'}$
are the transformation coefficients from the point of view of the
energy-trio $H$ to that of the energy-trio $K$. This means that the
wave-mechanical method can be completely replaced by the matrix method
involving the transformation of the matrices $F_x$ to the matrices
$F_H$ or vice versa.

The wave-mechanical theory, considered as a special case of the
matrix transformation theory, has to solve the following problem:
Suppose the matrices of all quantities, and in particular of the energy
$H$, to be known from the point of view of the coordinates, we have to
find the matrices representing them from the point of view of $H$. The
solution of this problem reduces to the solution of the linear integral
equation

$$\int H^0_{x'x''}\,\psi^0_{x''H'}\,dx'' = H'\,\psi^0_{x'H'}, \tag{126}$$

which is obtained from (125) if $K$ is replaced by $H$, $H$ by $x$, and
$a_{H'K'}$ by $\psi^0_{x'H'}$, and which obviously must be equivalent to
the Schrödinger equation†

$$H\,\psi^0_{x'H'} = H'\,\psi^0_{x'H'}. \tag{126 a}$$

The equivalence of these equations can be proved directly with the
help of the general definition of the elements of a matrix $F_C$ by means
of the integral

$$F^0_{C'C''} = \int \psi^{0*}_{x'C'}\,F\,\psi^0_{x'C''}\,dx'. \tag{127}$$

† We mean here and in the sequel Schrödinger’s equation not involving the time (and
serving to define the stationary states only). This circumstance is indicated by affixing
to all the quantities connected, directly or indirectly, with the energy operator K the
additional (upper) index 0.
This definition has been used until now only in connexion with such
‘key’ or ‘basic’ quantities $C$, one of which at least could be regarded
as the energy. This restriction does not seem, however, to be necessary,
and the formula (127) can be applied to quantities $C$ of any type
(provided the operators by which they are represented commute with each
other). We can, in particular, put $C = x$ (i.e. $C_1 = x$, $C_2 = y$,
$C_3 = z$), subject to the condition that the variables $x'$ and $C'$ in
$\psi^0_{x'C'}$ should be considered as independent. This means that the
two indices (or arguments) in the function $\psi^0_{x'C'}$ need not
necessarily refer to the same point.

We can thus in (127) put $C' = x'$ and $C'' = x''$, or, denoting the
integration variable by $x'''$ instead of $x'$, write

$$F^0_{x'x''} = \int \psi^{0*}_{x'''x'}\,F\,\psi^0_{x'''x''}\,dx''', \tag{127 a}$$

where the operator $F$ is understood to refer to the point $x'''$, i.e. to
be a function of $x'''$ and of the elementary operators
$p_{x'''} = \frac{h}{2\pi i}\frac{\partial}{\partial x'''}$.

The functions $\psi^0_{x'''x'}$ must obviously represent the identical
transformation (from the point of view of $x'''$ to that of $x'$), or, in
other words, the probability amplitudes that $x$ should be equal to
$x'''$ when it is known that it has the value $x'$. Since one and the
same particle cannot be simultaneously in two different places, this
means that $\psi^0_{x'''x'}$ must vanish when $x''' \neq x'$ and become
infinite when $x''' = x'$ (in view of the fact that $x$ is a continuous
variable). We can thus identify $\psi^0_{x'''x'}$ with the ‘unit matrix’
of the continuous case, i.e. put

$$\psi^0_{x'''x'} = \delta(x'''-x'). \tag{128}$$

This expression can be derived from the general formula

$$\psi^0_{x'''C'} = \int \psi^0_{x'''H'}\,a_{H'C'}\,dH'$$

[cf. (110), § 15] if we put $C' = x'$ and accordingly
$a_{H'x'} = a^*_{x'H'} = \psi^{0*}_{x'H'}$, in conjunction with the
orthogonality and normalizing relation
$\int \psi^0_{x'''H'}\,\psi^{0*}_{x'H'}\,dH' = \delta(x'''-x')$, the
$a_{x'H'}$ being in this case obviously identical with $\psi^0_{x'H'}$.

It is easy to see that, defined in this way, the function $\psi^0_{x'C'}$
also satisfies the usual orthogonality and normalizing relations:

$$\int \psi^{0*}_{x'''C'}\,\psi^0_{x'''C''}\,dx''' = \delta(C'-C''), \qquad \int \psi^{0*}_{x'C'}\,\psi^0_{x''C'}\,dC' = \delta(x'-x'').$$

In fact, putting $C' = x'$ and $C'' = x''$, we get, according to (128),

$$\int \psi^{0*}_{x'''x'}\,\psi^0_{x'''x''}\,dx''' = \int \delta(x'''-x')\,\delta(x'''-x'')\,dx''' = \delta(x'-x''),$$

and, putting $C' = x'''$,

$$\int \psi^{0*}_{x'C'}\,\psi^0_{x''C'}\,dC' = \int \delta(x'-x''')\,\delta(x''-x''')\,dx''' = \delta(x'-x'').$$

We thus see that the elements of a matrix $F_x$ can be defined according
to (127 a) and (128) by the integral

$$F^0_{x'x''} = \int \delta(x'-x''')\,F\,\delta(x'''-x'')\,dx''', \tag{128 a}$$

so that, in particular, we have

$$H^0_{x'x''} = \int \delta(x'-x''')\,H\,\delta(x'''-x'')\,dx''', \tag{128 b}$$

where $H$ denotes the usual Hamiltonian function of the coordinates $x$
and of the ‘components of the momentum’
$p_x = \frac{h}{2\pi i}\frac{\partial}{\partial x}$, both referred
to the point $x = x'''$ ($dx'''$ indicates the volume-element enclosing
this point).

It can now easily be shown that the integral equation (126), together
with the expression (128 b) for its ‘nucleus’, actually reduces to the
differential equation (126 a).

Let us first take that part of $H$ which depends upon the coordinates,
that is, the potential energy $U(x,y,z)$. We then get, according to
(128 a),

$$U^0_{x'x''} = \int U(x''')\,\delta(x'-x''')\,\delta(x'''-x'')\,dx''' = U(x'')\,\delta(x''-x'),$$

which, on substitution in (126), gives

$$\int U^0_{x'x''}\,\psi^0_{x''H'}\,dx'' = U(x')\,\psi^0_{x'H'}.$$

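Putting, further, $F = \partial/\partial x$, we have

$$F^0_{x'x''} = \int \delta(x'-x''')\,\frac{\partial}{\partial x'''}\delta(x'''-x'')\,dx''' = -\int \delta(x'-x''')\,\frac{\partial}{\partial x''}\delta(x'''-x'')\,dx''',$$

since, obviously,

$$\frac{\partial}{\partial x'''}\delta(x'''-x'') = -\frac{\partial}{\partial x''}\delta(x'''-x''),$$

and consequently,

$$\int F^0_{x'x''}\,\psi^0_{x''}\,dx'' = -\int \psi^0_{x''}\,dx'' \int \delta(x'-x''')\,\frac{\partial}{\partial x''}\delta(x'''-x'')\,dx''' = -\int \delta(x'-x''')\,dx''' \int \psi^0_{x''}\,\frac{\partial}{\partial x''}\delta(x'''-x'')\,dx''.$$

Now integrating by parts, we have

$$\int \psi^0_{x''}\,\frac{\partial}{\partial x''}\delta(x'''-x'')\,dx'' = -\int \delta(x'''-x'')\,\frac{\partial \psi^0_{x''}}{\partial x''}\,dx'',$$

because the product $\psi^0_{x''}\,\delta(x'''-x'')$ vanishes at the limits
of integration (or at infinity). We thus get

$$\int F^0_{x'x''}\,\psi^0_{x''}\,dx'' = \int \delta(x'-x''')\,dx''' \int \delta(x'''-x'')\,\frac{\partial \psi^0_{x''}}{\partial x''}\,dx'' = \frac{\partial \psi^0_{x'}}{\partial x'}.$$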

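In the same way it can be shown that

$$\int F^0_{x'x''}\,\psi^0_{x''}\,dx'' = \frac{\partial^2 \psi^0_{x'}}{\partial x'^2}$$

if $F = (\partial/\partial x)^2$, and so on. Putting finally
$F = \frac{1}{2m}\sum\left(\frac{h}{2\pi i}\frac{\partial}{\partial x}\right)^2 + U = H$,
we get

$$\int H^0_{x'x''}\,\psi^0_{x''}\,dx'' = H\,\psi^0_{x'}.$$

It should be mentioned that this formula holds identically, i.e.
irrespective of the shape of the function $\psi^0_{x'}$. The latter is
determined in fact as $\psi^0_{x'H'}$ by the condition that
$H\psi^0_{x'H'}$ should be equal to the product $H'\psi^0_{x'H'}$.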
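
The reduction of the integral operation to a differentiation can be checked numerically on a grid. The sketch below is an illustration under stated assumptions rather than part of the original argument: the delta function is replaced by a hypothetical grid stand-in (`1/dx` on the diagonal), its derivative by a central difference, and the integral over $x''$ by a sum with weight `dx`. Applied to $\sin x$, the resulting ‘matrix’ of $\partial/\partial x$ should return approximately $\cos x$.

```python
import math

N = 201
dx = 2.0 * math.pi / (N - 1)
xs = [j * dx for j in range(N)]

def delta(j, k):
    """Grid stand-in for delta(x' - x''): 1/dx on the diagonal, 0 elsewhere."""
    return 1.0 / dx if j == k else 0.0

def F_element(j, k):
    """Grid stand-in for the matrix element of d/dx (central difference of deltas)."""
    return (delta(j + 1, k) - delta(j - 1, k)) / (2.0 * dx)

def apply_matrix(psi, j):
    """Discrete version of the integral of F_{x'x''} psi_{x''} over x''."""
    return sum(F_element(j, k) * psi[k] for k in range(N)) * dx

psi = [math.sin(x) for x in xs]
interior = range(1, N - 1)  # the end points lack one neighbour
max_error = max(abs(apply_matrix(psi, j) - math.cos(xs[j])) for j in interior)
```

The residual `max_error` is the usual second-order error of a central difference and shrinks as the grid is refined.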

The generalization of the matrix theory which has been considered
hitherto consisted, in the main, in admitting quantities other than the
energy and those commuting with the energy to the role of the ‘basic
quantities’ determining the matrix representation of all other quantities
and being themselves represented by diagonal matrices. In the case
just considered, this role of basic quantities was switched over to the
coordinates. The matrices representing the latter, $x_x$ (or $x_{xyz}$,
$y_{xyz}$, $z_{xyz}$), are obviously defined by the equations
[cf. (101 b), § 14]

$$x_{x'x''} = x'\,\delta(x'-x''), \tag{129}$$

or, written out in detail:

$$\begin{aligned}
x_{x'y'z';\,x''y''z''} &= x'\,\delta(x'-x'')\,\delta(y'-y'')\,\delta(z'-z''),\\
y_{x'y'z';\,x''y''z''} &= y'\,\delta(x'-x'')\,\delta(y'-y'')\,\delta(z'-z''),\\
z_{x'y'z';\,x''y''z''} &= z'\,\delta(x'-x'')\,\delta(y'-y'')\,\delta(z'-z'').
\end{aligned} \tag{129 a}$$

The coordinates have, however, preserved at the same time another
fundamental role in which they have been employed from the very
beginning, namely, that of the arguments of the functions $\psi_{x'C'}$
(with $C = H$, $x$, or any other ‘basic trio’) which can serve for the
direct determination of the elements of a matrix $F_C$ by means of
equation (127). This second role of the coordinates is intimately
connected with the initially adopted representation of physical
quantities by means of operators, defined as functions of the
(rectangular) coordinates $x$, $y$, $z$ and of the elementary
differential operators
$p_x = \frac{h}{2\pi i}\frac{\partial}{\partial x}$,
$p_y = \frac{h}{2\pi i}\frac{\partial}{\partial y}$,
$p_z = \frac{h}{2\pi i}\frac{\partial}{\partial z}$,
which replace the components of the momentum.

These functions were supposed to be known, being in fact identified
with the functions representing the same quantities in the classical
theory (on the ground that $F(x, p_x)\psi$ reduces to the product
$F(x, g_x)\psi$ if $\psi$ is replaced by its approximate expression
$\psi = e^{i2\pi S/h}$, where $S$ is the action function of classical
mechanics).

We must now consider a further generalization of the transformation
theory, consisting in the replacement of the coordinates in this second
role, connected with the usual operator representation, by some other
quantities, e.g. $Q$, associated with operators which contain derivatives
with regard to $Q$.

The possibility (and, more than that, the necessity) of such a
generalization clearly follows from the fact that the functions
$\psi_{x'C'}$, considered as transformation coefficients ‘from the point
of view of $C$ to that of $x$’, or as probability amplitudes for one of
these two quantities having a given value when the value of the other is
known, are practically symmetrical with regard to both quantities.
Instead of, or rather together with, the functions $\psi_{x'C'}$ we must
consider the functions $\psi_{C'x'}$ which are simply equal to the
conjugate complex of the former and which correspond to the reciprocal
transformation. In these functions, however, it is the quantities $C$
which play the role of the coordinates, while the latter appear in the
role of the ‘basic quantities’ instead of $C$. Replacing the Schrödinger
wave functions $\psi_{x'H'}$ by transformation coefficients or
probability amplitudes of the most general type $a_{Q'C'}$, we can define
the matrix elements of a certain quantity $F$ with respect to $C$ by the
formulae

$$F_{C'C''} = \int a^*_{Q'C'}\,F\,a_{Q'C''}\,dQ',$$

or

$$F_{C'C''} = \sum_{Q'} a^*_{Q'C'}\,F\,a_{Q'C''},$$

according as $Q$ has a continuous or a discrete spectrum.

This definition will, however, remain meaningless so long as $F$ is not
specified as an operator ‘from the point of view’ of $Q$, i.e. as a
certain function of $Q$ ($Q_1$, $Q_2$, $Q_3$) and the derivatives
$\partial/\partial Q$. The operators which have been considered hitherto
have always been specified from the point of view of the coordinates
$x$, and obtained from the classical functions $F(x, g_x)$ by a simple
substitution of $p_x = \frac{h}{2\pi i}\frac{\partial}{\partial x}$ for
$g_x$. Adopting what can be denoted as the ‘principle of relativity’ with
regard to the ‘basic quantities’ which specify the operator
representation, we shall denote the operator representing a certain
quantity $F$ ‘from the point of view of $Q$’ by $F_{(Q)}$, where the
brackets are introduced to distinguish this operator from the
corresponding matrix $F_Q$. The operators defined in the usual way, i.e.
from the point of view of the coordinates, should be denoted accordingly
by $F_{(x)}$, and the general definition of the elements of the matrix
$F_C$ by means of the operator $F_{(Q)}$ should run as follows:

$$F_{C'C''} = \int a^*_{Q'C'}\,F_{(Q)}\,a_{Q'C''}\,dQ' = \int a_{C'Q'}\,F_{(Q)}\,a_{Q'C''}\,dQ' \tag{130}$$

if the spectrum of $Q$ is continuous, or

$$F_{C'C''} = \sum_{Q'} a^*_{Q'C'}\,F_{(Q)}\,a_{Q'C''} = \sum_{Q'} a_{C'Q'}\,F_{(Q)}\,a_{Q'C''} \tag{130 a}$$

if it is discontinuous.

Another obvious condition for the operators $F_{(Q)}$ is that the matrix
elements of $F_C$ defined by the preceding equations should not depend
upon the choice of the quantities $Q$.

Equations (130) and (130 a) bear a striking resemblance to the
transformation equations

$$F_{C'C''} = \iint a_{C'Q'}\,F_{Q'Q''}\,a_{Q''C''}\,dQ'\,dQ'',$$

and

$$F_{C'C''} = \sum_{Q'}\sum_{Q''} a_{C'Q'}\,F_{Q'Q''}\,a_{Q''C''},$$

or, in the abbreviated notation based on the matrix multiplication law,

$$F_C = a^{-1} F_Q\, a = a^\dagger F_Q\, a,$$

with $a$ denoting the transformation matrix $a_{Q'C'}$.

The equations of both types actually become identical if the operators
$F_{(Q)}$ satisfy the condition

$$F_{(Q)}\,a_{Q'C''} = \int F_{Q'Q''}\,a_{Q''C''}\,dQ'', \tag{131}$$

or

$$F_{(Q)}\,a_{Q'C''} = \sum_{Q''} F_{Q'Q''}\,a_{Q''C''}. \tag{131 a}$$

These conditions are a generalization of the equation

$$\int H^0_{x'x''}\,\psi^0_{x''H'}\,dx'' = H\,\psi^0_{x'H'},$$

which has already been obtained in connexion with the proof of the
equivalence of the Schrödinger equation (126 a) with the integral
equation (126). It should be observed that, according to the present
notation, we must write $H_{(x)}$ for the energy operator, and
$a_{x'H'}$ for the wave functions $\psi_{x'H'}$. Further, we easily get
as a generalization of equation (128 b) the following relation between
the operator $F_{(Q)}$ and the matrix $F_Q$:

$$F_{Q'Q''} = \int \delta(Q'-Q''')\,F_{(Q)}\,\delta(Q'''-Q'')\,dQ''', \tag{131 b}$$

where the functions $\delta(Q'-Q'')$ can be considered as the
transformation coefficients $a_{Q'Q''}$ on the assumption that the
spectrum of $Q$ is continuous. The formula (131 b) can be considered as
the direct consequence of (130).

Putting $F = C$ in (131) and taking into account that

$$\int C_{Q'Q''}\,a_{Q''C''}\,dQ'' = C''\,a_{Q'C''}$$

according to the definition of the transformation coefficients
$a_{Q'C'}$ [cf. equations (125) and (126)], we get

$$C_{(Q)}\,a_{Q'C''} = C''\,a_{Q'C''}. \tag{132}$$

This equation is the broadest generalization of Schrödinger’s equation,
with $C$ standing for $H$, $Q$ for $x$, and the probability amplitudes
$a_{Q'C'}$ (which could also be denoted by $\psi_{Q'C'}$) for the usual
‘wave functions’ $\psi_{x'H'}$.

If the form of the operator $C_{(Q)}$ as a function of $Q$ and of
$\frac{h}{2\pi i}\frac{\partial}{\partial Q}$ is known, equation (132)
can serve to determine the functions $a_{Q'C''}$ ($= a^*_{C''Q'}$) and
the characteristic values $C''$ of the operator $C_{(Q)}$. It should be
remarked that these characteristic values do not depend upon the choice
of the basic quantities $Q$ (i.e. are invariant with regard to the
transformation of the latter), being as a matter of fact nothing else
but the characteristic values of the operator $C_{(C)}$, or, in other
words, the (diagonal) elements of the matrix $C_C$. This corresponds to
the physical meaning of the characteristic values of a quantity, as the
values which this quantity can possibly assume, irrespective of the
values which can be, or actually are, assumed by any other quantities.

In deriving equation (132), we have assumed that the characteristic
values of $Q$ constitute a continuous set. If they constitute a discrete
set, the differential operator representation of different quantities
$F$ with regard to $Q$ becomes impossible, for the application of the
derivative operators $\partial/\partial Q$ to functions of $Q$ becomes
meaningless. Equation (131 a) can hold accordingly only when the
operator $F_{(Q)}$ reduces to a function of $Q$. The same refers to the
equation

$$F_{Q'Q''} = \sum_{Q'''} \delta_{Q'Q'''}\,F_{(Q)}\,\delta_{Q'''Q''},$$

which should replace equation (131 b) and which is meaningless, unless
the operator $F_{(Q)}$ reduces to a function of $Q$ (not containing the
derivatives $\partial/\partial Q$), in which case it reduces to

$$F_{Q'Q''} = F(Q'')\,\delta_{Q'Q''},$$

meaning that $F_Q$ is a diagonal matrix.

This example shows that the matrix theory, which we initially
developed on the basis of the operator theory, starting with the energy
operator $H_{(x)}$ and the wave functions defined by it according to
Schrödinger’s equation $H_{(x)}\psi_{x'H'} = H'\psi_{x'H'}$, is actually
more general than the operator theory even in its generalized form
corresponding to the replacement of the coordinates $x$ by some other
trio of quantities with continuously variable values.†

Another and perhaps logically more satisfactory procedure would be
to start (following Heisenberg, Jordan, and Dirac) from the other end,
i.e. with the matrix representation of physical quantities, deriving the
operator representation as an alternative form of it for the case when
the basic quantities admit continuously variable values, and using the
transformation theory for the definition of the probability amplitudes
$a_{Q'C'}$ and, in particular, of the wave functions $\psi_{x'H'}$ of the
de Broglie-Schrödinger wave-mechanical theory.

This purely deductive method has, however, from a didactic point
of view, the disadvantage of being too abstract and of starting with
ideas completely alien to customary or ‘classical’ conceptions. The
inductive method, which is adopted in this book, and which makes an
appeal not only to the logic but also to the intuition of the reader,
gradually leading him from the concrete customary conceptions to the
abstract new ideas, may prove more helpful for those who have to get
used to these new ideas and perform the logically simple but
psychologically difficult task of getting rid of the old conceptions.

To this it should be added that the matrix theory remains an empty
scheme so long as no concrete assumptions are made about the
commutation properties and the functional relationship of the matrices
concerned, the problem consisting in the actual determination of the
elements of these matrices from a certain ‘point of view’ (after which
a transition to some other point of view and the determination of the
corresponding probability amplitudes can be made with the help of
the transformation theory). These assumptions, however, involve
considerations which lie outside the logical realm of the matrix theory
and can hardly be understood without the fundamental idea of the
wave-mechanical theory, namely, that the motion of a particle in a given
field of force is determined in terms of probabilities by the propagation
of the associated waves.

† It would be possible to extend the operator theory to the discrete case if differential
coefficients were replaced by finite differences.

This refers in particular to the commutation relations between the
fundamental matrices $x$ and $p$,

$$px - xp = \frac{h}{2\pi i}\,\delta, \tag{133}$$

in conjunction with the fact that the latter have to be defined as the
components of the momentum in the classical expression of the energy
$H$ (replaced by the matrix $H_x$).
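
In the operator representation the commutation relation can be verified directly on a test function. The following sketch is only an illustration under assumptions that are not in the text: units with $h = 1$ (`H_PLANCK`), a hypothetical smooth test function `f`, and a central difference in place of the derivative. It applies $px - xp$, with $p = \frac{h}{2\pi i}\frac{d}{dx}$, to `f` and compares the result with $\frac{h}{2\pi i} f$.

```python
import cmath
import math

H_PLANCK = 1.0  # assumption: units in which h = 1

def p_op(f, eps=1e-4):
    """Return the function p f = (h / 2 pi i) df/dx, via a central difference."""
    c = H_PLANCK / (2j * math.pi)
    return lambda x: c * (f(x + eps) - f(x - eps)) / (2.0 * eps)

def commutator_px_xp(f):
    """Return the function (p x - x p) f."""
    xf = lambda x: x * f(x)
    return lambda x: p_op(xf)(x) - x * p_op(f)(x)

# Hypothetical test function and sample point.
f = lambda x: cmath.exp(-x * x + 1j * x)
x0 = 0.3

lhs = commutator_px_xp(f)(x0)
rhs = H_PLANCK / (2j * math.pi) * f(x0)
```

The two values agree up to the discretization error of the difference quotient, whatever test function is chosen.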

After these relations, which correspond to the quantum conditions
of Bohr’s theory, have been established, the whole problem of the
wave-mechanical theory can be stated as the transformation of all the
matrices involved (and in the first place of $x$, $p$, and $H$) from the
point of view of $x$ to that of $H$, the transformation coefficients
$\psi_{x'H'}$ being the probability amplitudes of finding the particle in
a given position when its energy is known or with a given energy if its
position is known. The actual solution of this problem is usually
reduced to the solution of Schrödinger’s equation involving the operator
$H_{(x)}$.

As an illustration of the ‘principle of relativity’ with respect to the
basic quantities in the operator representation, we shall consider the
results which are obtained if the coordinates are replaced in this role
by the momenta $p$. The latter must be considered in this case as
ordinary quantities ($= Q$), while the coordinates, in order that the
‘quantum conditions’ (133) should be satisfied, must be defined as
differential operators according to the formulae

$$x = -\frac{h}{2\pi i}\frac{\partial}{\partial p_x}, \qquad y = -\frac{h}{2\pi i}\frac{\partial}{\partial p_y}, \qquad z = -\frac{h}{2\pi i}\frac{\partial}{\partial p_z}. \tag{133 a}$$

The energy operator $H_{(p)}$ can be determined accordingly as the
operator resulting from the substitution in the classical Hamiltonian
function $(p_x^2+p_y^2+p_z^2)/(2m) + U(x,y,z)$ of the elementary
operators (133 a) for the coordinates. The new wave functions
$\psi_{p'H'}$ corresponding to this definition of the energy operator are
determined by the differential equation [cf. (132)]:

$$H_{(p)}\,\psi_{p'H'} = H'\,\psi_{p'H'}, \tag{133 b}$$

which in general is entirely different from that of Schrödinger, since
the kinetic energy $(p_x^2+p_y^2+p_z^2)/(2m)$, which in the
$x$-representation reduced to the Laplacian differential operator of the
wave theory $\nabla^2$ [multiplied by $-h^2/(8\pi^2 m)$], in the
$p$-representation remains an ordinary quantity, or more exactly an
ordinary factor which has to be multiplied by the function
$\psi_{p'H'}$, while the potential energy becomes a differential operator
acting on this function, the result of the operation $H_{(p)}$ being
equivalent to the multiplication of $\psi_{p'H'}$ by a constant factor
$H'$, one of the characteristic values of $H$. As stated above, these
characteristic values must be the same whether we start with the basic
quantities $x$ or $p$.

The probability amplitudes $\psi_{p'H'}$ are, however, in general,
functions of $p'$ entirely different from the ordinary wave functions
$\psi_{x'H'}$ (with the exception of the case of the harmonic oscillator,
where the potential energy is the same quadratic function of the
coordinates as the kinetic energy is of the momentum components).
According to the fundamental equation of the transformation theory
[see, for instance, (119 b)] they must be connected with each other by
the relations

$$\psi_{x'H'} = \int a_{x'p'}\,\psi_{p'H'}\,dp', \qquad \psi_{p'H'} = \int a_{p'x'}\,\psi_{x'H'}\,dx' = \int a^*_{x'p'}\,\psi_{x'H'}\,dx', \tag{134}$$

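where the transformation coefficients $a_{x'p'}$ can be defined by the
operator equation

$$p_{(x)}\,a_{x'p'} = p'\,a_{x'p'}, \tag{134 a}$$

that is,

$$\frac{h}{2\pi i}\frac{\partial}{\partial x'}a_{x'y'z'} = p'_x\,a_{x'y'z'}, \qquad \frac{h}{2\pi i}\frac{\partial}{\partial y'}a_{x'y'z'} = p'_y\,a_{x'y'z'}, \qquad \frac{h}{2\pi i}\frac{\partial}{\partial z'}a_{x'y'z'} = p'_z\,a_{x'y'z'}.$$

This gives

$$a_{x'p'} = \frac{1}{\sqrt{h}}\,e^{i2\pi p'x'/h}, \tag{134 b}$$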

$p'x'$ denoting the scalar product of the vectors $p$ and $r$, i.e. the
sum $p'_x x' + p'_y y' + p'_z z'$. The coefficient $1/\sqrt{h}$ follows
from the orthogonality and normalizing relation

$$\int a_{x'p'}\,a^*_{x'p''}\,dx' = \delta(p'-p''), \quad\text{or}\quad \int a_{x'p'}\,a^*_{x''p'}\,dp' = \delta(x'-x'').$$

The same result is obtained if the functions $a_{x'p'}$, or rather
$a_{p'x'}$, are defined by the operator equation

$$x_{(p)}\,a_{p'x'} = x'\,a_{p'x'},$$

which, because $x_{(p)} = -\frac{h}{2\pi i}\frac{\partial}{\partial p}$,
gives

$$a_{p'x'} = \frac{1}{\sqrt{h}}\,e^{-i2\pi x'p'/h}, \tag{134 c}$$

in agreement with the relation $a_{p'x'} = a^*_{x'p'}$.

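Substituting these expressions in (134), we get

$$\psi_{x'H'} = \frac{1}{\sqrt{h}}\int \psi_{p'H'}\,e^{i2\pi p'x'/h}\,dp' = \frac{1}{\sqrt{h}}\iiint \psi_{p'_x p'_y p'_z H'}\,e^{i2\pi(p'_x x'+p'_y y'+p'_z z')/h}\,dp'_x\,dp'_y\,dp'_z, \tag{135}$$

$$\psi_{p'H'} = \frac{1}{\sqrt{h}}\int \psi_{x'H'}\,e^{-i2\pi p'x'/h}\,dx' = \frac{1}{\sqrt{h}}\iiint \psi_{x'y'z'H'}\,e^{-i2\pi(p'_x x'+p'_y y'+p'_z z')/h}\,dx'\,dy'\,dz'. \tag{135 a}$$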

The first of these formulae can obviously be regarded as the expansion
of the function $\psi_{x'H'}$ in a Fourier integral with the amplitude
coefficients $\frac{1}{\sqrt{h}}\psi_{p'H'}$, while the second gives the
explicit expression of these coefficients. Remembering the
wave-mechanical interpretation of the vector $p'/h$ as equal to the
reciprocal of the wave-length and pointing in the direction of the
propagation of the waves associated with the motion of the particle, we
can regard the transformation coefficients $a_{x'p'}$ as plane sine
waves (without the time factor, however!), and we can interpret the
transformation equation (135) as the representation of the wave function
$\psi_{x'H'}$ by means of a superposition of plane sine waves with
appropriate amplitudes and travelling in appropriate directions. This
physical interpretation is in complete harmony with the physical meaning
of the Fourier amplitudes $\psi_{p'H'}$ as the probability amplitudes for
the particle to have a definite momentum $p'$ (irrespective of its
position) for a given value $H'$ of its total energy to which the
function $\psi_{x'H'}$ refers.
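
The Fourier relation between the two representations can be tried out numerically. The sketch below is an illustration under assumptions not made in the text: one dimension only, units with $h = 1$ (`H_PLANCK`), and a hypothetical Gaussian wave function $e^{-\pi x^2}$, chosen because its transform under these conventions is known in closed form to be $e^{-\pi p^2}$. The integral over $x'$ is replaced by a plain Riemann sum.

```python
import cmath
import math

H_PLANCK = 1.0  # assumption: units in which h = 1

def psi_x(x):
    """Hypothetical Gaussian wave function in the x-representation."""
    return math.exp(-math.pi * x * x)

def a_px(p, x):
    """Coefficient a_{p'x'} = h^(-1/2) exp(-i 2 pi p x / h), one dimension."""
    return cmath.exp(-2j * math.pi * p * x / H_PLANCK) / math.sqrt(H_PLANCK)

def psi_p(p, half_width=8.0, n=1601):
    """Riemann-sum approximation of the integral of a_{p'x'} psi_{x'} over x'."""
    dx = 2.0 * half_width / (n - 1)
    total = 0j
    for i in range(n):
        x = -half_width + i * dx
        total += a_px(p, x) * psi_x(x)
    return total * dx

amplitude = psi_p(0.5)
expected = math.exp(-math.pi * 0.25)  # closed-form value exp(-pi p^2) at p = 0.5
```

For this particular function the amplitude in the $p$-representation has the same shape as in the $x$-representation; for a general potential the two functions differ, as stated above.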

We shall not consider in further detail the generalized transformation
theory and its application to operators other than $x$, $p$, and $H$.
There is, however, one particular class of transformations which have
been alluded to at the end of § 16 as ‘canonical transformations of the
second kind’ and which deserve special notice. They consist in a
transition from the original trio of (rectangular) coordinates ($x$) and
the associated momentum operators
($p_x = \frac{h}{2\pi i}\frac{\partial}{\partial x}$) to some new basic
trio of mutually commuting coordinates ($Q$) and mutually commuting
momenta ($P$) satisfying the commutation relation

$$PQ - QP = \frac{h}{2\pi i}\,\delta \tag{136}$$

for a given motion specified by a definite energy operator

$$H_{(x)} = H(x, y, z;\ p_x, p_y, p_z),$$

which is thereby transformed into
$H_{(Q)} = H_{(Q)}(Q_1, Q_2, Q_3;\ P_1, P_2, P_3)$.
The quantities $P$ and $Q$ satisfying the above relations are said to be
‘canonically conjugate’ with each other. From the point of view of the
new coordinates ($Q$) the new momenta ($P$) are represented by the
operators $P_{(Q)} = \frac{h}{2\pi i}\frac{\partial}{\partial Q}$ (just
as the $Q$’s are represented from the point of view of the $P$’s by the
operators $Q_{(P)} = -\frac{h}{2\pi i}\frac{\partial}{\partial P}$). An
operator representation of the $P$’s from the point of view of the
original coordinates ($x$) is, however, possible in the particular case
only when the $Q$’s are defined as certain functions of the $x$’s not
involving the $p_x$’s or the $P$’s. In this case, which corresponds to
the ‘point transformation’ of the classical theory, the new momenta
($P$) can be expressed as certain functions of the original ones $p_x$
(involving as parameters the coordinates $x$ or $Q$). In the general
case of a canonical transformation corresponding to a ‘contact
transformation’ of the classical theory such a relationship between the
new and the old variables does not exist and some kind of matrix
representation must be used for the definition of the latter. The
relationship between the new and the old variables can be expressed with
the help of a certain transformation matrix $\Phi$ according to the
equations

$$Q = \Phi^{-1}x\,\Phi, \qquad P = \Phi^{-1}p\,\Phi,$$

that is,

$$Q_1 = \Phi^{-1}x\,\Phi,\quad Q_2 = \Phi^{-1}y\,\Phi,\quad Q_3 = \Phi^{-1}z\,\Phi;\qquad P_1 = \Phi^{-1}p_x\Phi,\quad P_2 = \Phi^{-1}p_y\Phi,\quad P_3 = \Phi^{-1}p_z\Phi. \tag{136 a}$$

These equations automatically secure the fulfilment of the commutation
relations which must exist between the new variables,

$$Q_iQ_k - Q_kQ_i = 0, \qquad P_iP_k - P_kP_i = 0, \qquad P_iQ_k - Q_kP_i = \frac{h}{2\pi i}\,\delta_{ik}, \tag{136 b}$$

as a consequence of those existing between the original ones.

In order that the new variables should be represented by Hermitian
matrices just as the original ones, the transformation matrix $\Phi$ must
be unitary, i.e. satisfy the relation $\Phi^{-1} = \Phi^\dagger$.
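
The algebraic mechanism behind these statements can be tried out on small matrices. The sketch below is an illustration only: no pair of finite matrices can satisfy $px - xp = \frac{h}{2\pi i}\delta$ exactly (the trace of a commutator vanishes), so the hypothetical 2×2 Hermitian matrices `x` and `p` are arbitrary samples. What the code checks is that a unitary `Phi` carries the commutator of the old variables into the commutator of the new ones, and keeps the new variables Hermitian.

```python
import cmath
import math

def matmul(a, b):
    """Product of two 2x2 matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(a):
    """Conjugate transpose of a 2x2 matrix."""
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]

def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(2)] for i in range(2)]

def similarity(phi, m):
    """Phi^-1 m Phi, using Phi^-1 = Phi^dagger for a unitary Phi."""
    return matmul(dagger(phi), matmul(m, phi))

def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

# Hypothetical Hermitian sample matrices standing in for x and p.
x = [[1.0 + 0j, 2.0 - 1.0j], [2.0 + 1.0j, -0.5 + 0j]]
p = [[0.3 + 0j, 1.0j], [-1.0j, 1.2 + 0j]]

# A unitary 2x2 matrix built from an angle and a phase (sample values).
c, s, e = math.cos(0.7), math.sin(0.7), cmath.exp(0.4j)
Phi = [[c + 0j, -e * s], [e.conjugate() * s, c + 0j]]

Q = similarity(Phi, x)
P = similarity(Phi, p)

is_unitary = close(matmul(dagger(Phi), Phi), [[1, 0], [0, 1]])
stays_hermitian = close(Q, dagger(Q)) and close(P, dagger(P))
commutator_preserved = close(commutator(P, Q), similarity(Phi, commutator(p, x)))
```

If the commutator of the original variables is a multiple of the unit matrix, as in the infinite-dimensional case, the similarity transformation leaves it unchanged, which is the content of the statement above.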

The equations (136 a) are thus formally quite similar to the equations
(124) of § 16. They have, however, an entirely different physical
meaning. While the transformation matrix $a$ in (124) has a mixed
character referring to two different sets of states, the elements of the
matrix $\Phi$ refer to the same set of states specified by the
characteristic values of some basic quantity which serves for the
definition of the matrices $x$, $p_x$, $Q$, $P$, and $H$ (this basic
quantity can in particular coincide with the invariable energy $H$).

The equations (136 a) must be considered as corresponding to the
classical equations
$p = \frac{\partial\Phi}{\partial x}$, $Q = \frac{\partial\Phi}{\partial P}$
[cf. (31 a), § 4] defining a contact transformation with the help of an
arbitrary function $\Phi$. In the quantum theory the latter is replaced
by the likewise arbitrary transformation matrix $\Phi$.

In the classical theory a canonical transformation is characterized by 
the fact that it does not alter the canonical form of the equations of 
motion. The same criterion is easily seen to apply to the canonical 
transformation (136 a) of the quantum theory. 

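We have, in fact, differentiating $Q$ and $P$ with respect to the time
$t$,

$$\frac{dQ}{dt} = [H, Q], \qquad \frac{dP}{dt} = [H, P],$$

which in virtue of (136) can be written in the form

$$\frac{dQ}{dt} = \frac{\partial H}{\partial P}, \qquad \frac{dP}{dt} = -\frac{\partial H}{\partial Q}$$

[cf. § 7, eqs. (43 a) and (44 c)].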

An equivalent form of the condition that the variables $P$ and $Q$
should be canonically conjugate (in the classical sense), i.e. that they
should satisfy the canonical equations of motion, is that the Poisson
bracket expression

$$[A, B] = \sum\left(\frac{\partial A}{\partial p_x}\frac{\partial B}{\partial x} - \frac{\partial A}{\partial x}\frac{\partial B}{\partial p_x}\right)$$

should be equal to 1 for $A = P_i$, $B = Q_i$ ($i = 1, 2, 3$) and to 0
for all the other combinations of the variables $P$, $Q$. This condition
corresponds to the commutation conditions (136 b) which can be written in
the form $[Q_i, Q_k] = 0$, $[P_i, P_k] = 0$, $[P_i, Q_k] = \delta_{ik}$,
the classical Poisson bracket being the analogue of the quantum bracket
expression $[A, B] = \frac{2\pi i}{h}(AB - BA)$ (cf. § 8).
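
The classical criterion is easy to test on an example. The sketch below is an illustration under assumptions not found in the text: one degree of freedom, partial derivatives approximated by central differences, and a hypothetical point transformation $Q = x^2$, $P = p/(2x)$, for which the bracket $[P, Q]$ (with the sign convention used above) should equal 1.

```python
def poisson_bracket(A, B, x, p, eps=1e-5):
    """[A, B] = dA/dp * dB/dx - dA/dx * dB/dp, by central differences."""
    dA_dp = (A(x, p + eps) - A(x, p - eps)) / (2.0 * eps)
    dA_dx = (A(x + eps, p) - A(x - eps, p)) / (2.0 * eps)
    dB_dp = (B(x, p + eps) - B(x, p - eps)) / (2.0 * eps)
    dB_dx = (B(x + eps, p) - B(x - eps, p)) / (2.0 * eps)
    return dA_dp * dB_dx - dA_dx * dB_dp

# Hypothetical point transformation: Q depends on x only.
Q = lambda x, p: x * x
P = lambda x, p: p / (2.0 * x)

x0, p0 = 1.3, 0.7  # arbitrary sample point with x0 != 0
bracket_PQ = poisson_bracket(P, Q, x0, p0)
bracket_QQ = poisson_bracket(Q, Q, x0, p0)
bracket_PP = poisson_bracket(P, P, x0, p0)
```

A pair that is not canonically conjugate, such as $Q = x^2$ with $P = p$, gives $[P, Q] = 2x$ instead of 1 and fails the test.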

18. Geometrical Representation of the Transformation Theory

The understanding of the generalized matrix theory, connected with
the ‘principle of relativity’ in the choice of the basic quantities and
with the transformation from one ‘basis’ to another, can be greatly
facilitated by the use of a geometrical picture, or rather of a geometrical
language, suggested by the formal similarity between the equations of
the transformation theory developed in the preceding sections and the
theory of linear orthogonal transformations of ordinary analytical
geometry. The nucleus of this analogy is that in both cases the
transformation equations are linear (and homogeneous) and that the
transformation coefficients satisfy similar orthogonality and normalizing
relations. (The mere idea of ‘orthogonality’ is suggestive of mutually
perpendicular axes.)

The choice of the basic quantities in the present theory corresponds 
to the choice of the coordinate system in the geometrical theory, and 
the relativity in the choice of these basic quantities corresponds to the 
relativity in the choice of the coordinate system—or, in other words, 
to the equivalence of all the directions in space. 

It will be remembered that in analytical geometry a linear orthogonal
transformation means a set of linear homogeneous equations between
the coordinates $x = x_1$, $y = x_2$, $z = x_3$ of an arbitrarily chosen
point with respect to one system of axes, $S$, say, and the coordinates
of the same point $\xi = \xi_1$, $\eta = \xi_2$, $\zeta = \xi_3$ with
respect to another system $\Sigma$, both systems being orthogonal and
having the same origin. These equations can be written in the form

$$\xi_\nu = \sum_n a_{\nu n}\,x_n \quad\text{or}\quad x_n = \sum_\nu a_{n\nu}\,\xi_\nu, \tag{137}$$

with

$$a_{n\nu} = a_{\nu n} = \cos(x_n, \xi_\nu). \tag{137 a}$$

The relations $a_{n\nu} = a_{\nu n}$, which are geometrically evident,
can be obtained analytically from the orthogonality condition

$$\sum_n x_n^2 = \sum_\nu \xi_\nu^2, \tag{137 b}$$

which gives, in conjunction with (137),

$$\sum_\nu a_{\nu n'}\,a_{\nu n''} = \delta_{n'n''}, \qquad \sum_n a_{n\nu'}\,a_{n\nu''} = \delta_{\nu'\nu''}. \tag{137 c}$$

On the other hand, substituting the expressions of the $\xi$’s in those
of the $x$’s and vice versa, we have

$$\sum_\nu a_{n'\nu}\,a_{\nu n''} = \delta_{n'n''}, \qquad \sum_n a_{\nu'n}\,a_{n\nu''} = \delta_{\nu'\nu''}. \tag{137 d}$$

The comparison of these equations with the preceding equations leads to
the relations (137 a), without, of course, the geometrical interpretation
with which we started.

The transformation theory which has been developed in the preceding
sections can be obtained from this elementary theory of linear
orthogonal transformations by a twofold generalization.

Firstly, by making the number of coordinates specifying a point
infinite, i.e. by considering, instead of the ordinary three-dimensional
space, a fictitious space with infinitely many dimensions.

Secondly, by considering the coordinates of a point as complex
quantities and by defining the square of its distance from the origin,
not as the sum of the squares of the coordinates, but as the sum of
the squares of their moduli, thus replacing the orthogonality condition
(137 b) by the following condition:

$$\sum_n x_n x_n^* = \sum_\nu \xi_\nu \xi_\nu^*, \tag{138}$$

the summation being extended over all the coordinates. We get in this
case, instead of (137 c),

$$\sum_\nu a^*_{\nu n'}\,a_{\nu n''} = \delta_{n'n''}, \qquad \sum_n a^*_{n\nu'}\,a_{n\nu''} = \delta_{\nu'\nu''},$$

and, since equations (137 d) are not altered,

$$a_{n\nu} = a^*_{\nu n} \quad\text{or}\quad a_{\nu n} = a^*_{n\nu}, \tag{138 a}$$

that is, $a^{-1} = a^\dagger$.

In the special case of real coordinates $x$, $\xi$, this ‘unitary’
transformation reduces to the usual orthogonal transformation (though
with an unlimited number of variables), and we get
$a^{-1} = a^\dagger = \tilde{a}$ (transposed matrix), that is,
$a^{-1} = \tilde{a}$, which is another expression of the relations
(137 a). Although a geometrical interpretation cannot be associated
with an infinite number of complex variables $x$, $\xi$, connected with
each other by a unitary transformation, yet, since the number of
variables does not make any difference from the purely analytical point
of view (so long as it is larger than 1), we can preserve, if not a
geometrical picture, at least a geometrical language with respect to the
variables $x$, $\xi$ and the transformation coefficients $a_{n\nu}$. We
can accordingly regard (or rather denote) the former as the coordinates
of a point in a space of infinitely many dimensions with respect to two
orthogonal systems of coordinates $S$ and $\Sigma$, while the latter can
still be regarded (or denoted) as the cosines of the angles between the
old and the new coordinate axes. The variables $x_n$ and $\xi_\nu$ can be
defined also as the projections (or components) of a certain vector
$\mathbf{r}$ on these axes.
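
A two-dimensional complex example shows the two defining properties of such a transformation at work. The sketch below is an illustration under assumptions not made in the text: the number of coordinates is cut down to two, and the unitary matrix `a` and the vector `x` are hypothetical sample values. It checks that the sum of the squared moduli is the same in both systems, and that the inverse transformation uses the conjugate of the same coefficients with the indices interchanged.

```python
import cmath
import math

# Hypothetical unitary 2x2 matrix of coefficients a[nu][n] (angle and phase are samples).
c, s, e = math.cos(0.7), math.sin(0.7), cmath.exp(0.4j)
a = [[c + 0j, -e * s], [e.conjugate() * s, c + 0j]]

# Complex 'coordinates' of a point with respect to the first system.
x = [1.0 + 2.0j, -0.5 + 0.3j]

# Coordinates of the same point with respect to the second system.
xi = [sum(a[nu][n] * x[n] for n in range(2)) for nu in range(2)]

# Inverse transformation: the coefficient for (n, nu) is the conjugate of a[nu][n].
x_back = [sum(a[nu][n].conjugate() * xi[nu] for nu in range(2)) for n in range(2)]

norm_x = sum(abs(v) ** 2 for v in x)
norm_xi = sum(abs(v) ** 2 for v in xi)
```

With real coefficients the same code describes an ordinary rotation of orthogonal axes, the conjugation having no effect.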
In the simplest matrix transformation problem which was considered
at the beginning of § 15, the role of the coordinates $x_n$ and $\xi_\nu$
is played by the characteristic functions (or rather amplitudes)
$\psi^0_{H'}$, $\psi^0_{K'}$. This is clearly seen from the fact that they
are transformed according to equations (110) and (111) which are the
analogues of equations (137), and that they satisfy the orthogonality
relation (113) which is exactly of the same type as (138). We can thus
describe the matrix transformation theory in a very suggestive
geometrical language, according to the following principles.

Each stationary state specified by a wave function $\psi^0_{H'}$ can be represented
geometrically by a certain direction or axis $H'$ in a space of
infinitely many dimensions, which we shall call the state-space. The
states specified by the different functions $\psi^0_{H'}$ are represented by axes
$H'$ which are perpendicular to each other, the complete set of states
defined by the operator $H$ forming a complete orthogonal system of
coordinate axes in the state-space, which we shall also denote by the
letter $H$. The ‘completeness’ of the system means that any ‘vector’ in
the state-space can be represented as the geometrical sum of its components
along the axes of $H$.

This applies in particular to vectors drawn in the directions of another
complete orthogonal system of axes $K'$, which represent geometrically
the stationary states defined by the operator $K$. The transformation
coefficients $a_{H'K'}$ can be regarded as the projections of a unit vector in
the direction of a definite axis $K'$ on the different axes $H'$ or, loosely
speaking, as the cosines of the angles between the axes $K'$ and $H'$. The
latter expression requires, however, a correction, inasmuch as the
coefficients $a^{-1}_{K'H'} = a^*_{H'K'}$ can also claim the same role, for they
represent the projection of a unit vector in the direction of a certain
axis $H'$ on the different axes $K'$. This interpretation of $a_{H'K'}$ and $a^{-1}_{K'H'}$
immediately follows from the comparison of the transformation equations
$\phi^0_{K'} = \sum_{H'} \psi^0_{H'} a_{H'K'}$ and $\psi^0_{H'} = \sum_{K'} \phi^0_{K'} a^{-1}_{K'H'}$ with (137).

It should be remembered that the quantities $\psi^0_{H'}$ and $\phi^0_{K'}$ appearing
in these equations in the role of rectangular coordinates of a point in
the state-space are themselves functions of the ordinary spatial
coordinates $x$, $y$, $z$, and that, moreover, they refer to the same (arbitrarily
chosen) point.

So long as this point remains unspecified, $\psi^0_{H'}$ and $\phi^0_{K'}$ can be treated
as vectors, but as soon as we specify it, putting $x = x'$, we get numbers
$\psi_{x'H'}$ and $\phi_{x'K'}$ which, as we know, both with regard to their physical
meaning (as probability amplitudes) and analytical properties (as
transformation coefficients), are wholly similar to the numbers $a_{H'K'}$. We
can regard them accordingly as the components of the vectors $\psi^0_{H'}$ and
$\phi^0_{K'}$ along the axes of a third coordinate system $X$ in the state-space,
each axis $x'$ of this system specifying a definite position $x = x'$, $y = y'$,
$z = z'$ of the particle in the ordinary space. The axes of this new system
$X$ must be regarded as orthogonal (i.e. mutually perpendicular) in spite
of the fact that they correspond not to a discrete set of states, like the
axes of the system $H$ or $K$, but to a continuum of states.

Since the functions $\psi_{x'H'}$ and $\phi_{x'K'}$ are normalized to unity, both with
respect to $x$ and to $H$ or $K$, the vectors $\psi^0_{H'}$, $\phi^0_{K'}$ as well as $\psi^0_{x'}$, $\phi^0_{x'}$ (the
latter specifying a certain position in space irrespective of the values
of the energy $H$ or $K$) can be regarded as unit vectors (i.e. having the
length unity) and the numbers $\psi_{x'H'}$ and $\phi_{x'K'}$ interpreted geometrically
in the same way as the numbers $a_{H'K'}$, namely, as the cosines of the
angles between the axes $x'$ (not in the ordinary space of course, but
in the state-space!) on the one hand, and the axes $H'$ or $K'$
on the other.

From this point of view the transformation equations

$$\phi_{x'K'} = \sum_{H'} \psi_{x'H'} a_{H'K'}, \qquad \psi_{x'H'} = \sum_{K'} \phi_{x'K'} a^{-1}_{K'H'} \qquad (139)$$

acquire an extremely simple geometrical meaning: they become
the generalization of the well-known formula of analytical geometry for
the cosine of the angle between two directions, $x'$ and $K'$, say, expressed
in terms of the cosines of the angles between these directions and a
complete set of mutually perpendicular directions constituting a
coordinate system $H$.

In fact, if we write $\cos(x', K')$, $\cos(x', H')$, and $\cos(H', K')$ instead of
$\phi_{x'K'}$, $\psi_{x'H'}$, and $a_{H'K'}$ respectively, the first of equations (139) assumes
the familiar form

$$\cos(x', K') = \sum_{H'} \cos(x', H')\cos(H', K').$$

It becomes, however, necessary to distinguish two different cosines
between the same two directions (corresponding to the projection of the
first on the second or the second on the first), since $a^{-1}_{K'H'} = \cos(K', H')$
is not equal to $a_{H'K'} = \cos(H', K')$ but to its conjugate complex:
$\cos(K', H') = \cos^*(H', K')$. (The same refers, of course, to the functions
$\psi_{x'H'}$ and $\psi^*_{x'H'}$, or $\phi_{x'K'}$ and $\phi^*_{x'K'}$.)

Following Dirac, we shall often use in future the simplified notation
$(K'|H')$ and $(H'|K')$ for these two ‘cosines’ or transformation
coefficients; we shall write likewise

$$\phi_{x'K'} = (x'|K'), \qquad \psi_{x'H'} = (x'|H'),$$

thus avoiding the unnecessary complications arising from the use of
different letters, $a$, $\psi^0$, $\phi^0$, etc. The unit vector (in the state-space)
defining a certain state $x'$, $H'$, or $K'$ per se, i.e. irrespective of the other
states with which it can be associated, will be denoted accordingly by
the symbols $(x'|)$, $(H'|)$, $(K'|)$ or $(|x')$, $(|H')$, $(|K')$. This notation has
the advantage of representing the same thing by the same symbol (or
two ‘conjugate’ symbols), while in our previous notation the same state
corresponding to a given position $x'$ was described by two different
symbols $\psi_{x'}$ or $\phi_{x'}$ depending upon the ‘coordinate system’ $H$ or $K$
which we had in mind.

With the new notation the transformation equations (139) can be
written in the form

$$(x'|K') = \sum_{H'} (x'|H')(H'|K'), \qquad (x'|H') = \sum_{K'} (x'|K')(K'|H'). \qquad (139 a)$$


Since the three coordinate systems $H$, $K$, and $x$ are equivalent to each
other, we could write by analogy a third relation of the same form,
namely,

$$(H'|K') = \sum_{x'} (H'|x')(x'|K'),$$

if $x'$ were discretely variable, like $H'$ and $K'$. Since, however, $x'$ is
continuously variable, we must replace the sum by an integral over
$x'$, which gives

$$(H'|K') = \int (H'|x')(x'|K')\,dx', \qquad (139 b)$$

or, in the previous notation,

$$a_{H'K'} = \int \psi^{0*}_{H'} \phi^0_{K'}\,dV,$$

which is nothing else but the formula (110 a) obtained at the beginning
of § 15, and again in the way just shown (but without the associated
geometrical interpretation) somewhat later.
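The composition law (139 a) and the rule $\cos(K', H') = \cos^*(H', K')$ can be checked numerically in a finite-dimensional stand-in for the state-space. The sketch below is only an illustration: the two-dimensional space, the three orthonormal systems `H`, `K`, `X`, and the function names are assumptions, with the discrete sum over a two-element `X` playing the part of the integral in (139 b).

```python
import cmath
import math

def bracket(u, v):
    """Amplitude (u|v): the first vector enters as its conjugate complex."""
    return sum(a.conjugate() * b for a, b in zip(u, v))

def orthonormal_pair(theta, alpha):
    """Two mutually perpendicular unit vectors in a 2-dimensional space."""
    c, s = math.cos(theta), math.sin(theta)
    p = cmath.exp(1j * alpha)
    return [[complex(c), s * p], [-s * p.conjugate(), complex(c)]]

# Assumption: three arbitrary orthonormal systems standing for H, K and x.
H = orthonormal_pair(0.0, 0.0)
K = orthonormal_pair(0.4, 1.1)
X = orthonormal_pair(1.2, -0.5)

# (x'|K') compared with the sum over H' of (x'|H')(H'|K').
max_error = 0.0
for x in X:
    for k in K:
        direct = bracket(x, k)
        composed = sum(bracket(x, h) * bracket(h, k) for h in H)
        max_error = max(max_error, abs(direct - composed))

# The two 'cosines' between the same directions are conjugate complex.
asymmetry = abs(bracket(K[0], H[1]) - bracket(H[1], K[0]).conjugate())
print(max_error, asymmetry)
```

Both printed quantities should vanish to rounding error; the first expresses the completeness of the system `H`, the second the relation between the two cosines.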

The preceding equations (139 a) and (139 b) hold, of course, for any
three sets of states which may be specified by three basic ‘trios’. It
should be remembered that, from the physical point of view, they
express the addition and multiplication law for the probability
amplitudes. The geometrical interpretation of the probability amplitudes
$(Q'|C')$ as the cosines between the directions $Q = Q'$ and $C = C'$ in
the state-space is in perfect harmony with the initial interpretation
of the orthogonality between two functions representing two different
states as the expression of the alternative character of these states. All
those states which are represented by mutually perpendicular directions
in the state-space are alternative or mutually exclusive, in the sense
that the probability of finding the particle in one of them when it is
known to be in another is equal to zero. All such states may always
be referred to the same set.

Having elucidated the geometrical meaning of the probability ampli¬ 
tudes—or transformation matrices—we shall now turn to the geometri¬ 
cal interpretation of the ordinary matrices, which represent physical 
quantities from one or the other point of view. This interpretation is 
again determined by the transformation equations (121) which show 
that Hermitian matrices can be considered as a generalization of the 
so-called tensors, or more exactly symmetrical tensors, of the elementary
three-dimensional analytical geometry. 

A tensor can be defined as a composite quantity with a number of 
components, each of which refers to two axes of the same system of 
coordinates, and behaves with respect to a transformation of the co¬ 
ordinate system in the same way as the product of the components of 
two vectors along the corresponding axes. 

Let us consider again the two coordinate systems $S$ and $\Sigma$ and denote
the components of the same vector, $f$, say, along the axes of $S$ and $\Sigma$
by $f_n$ and $f_\nu$ respectively. If $g$ is some other vector, and if we form
the products of all the components of $f$ with all the components of $g$,
referred to the same system, we shall obtain a set of 9 quantities

$$T_{mn} = f_m g_n \quad\text{or}\quad T_{\mu\nu} = f_\mu g_\nu, \qquad (140)$$

which can be considered as the components of the same tensor $T$
referred to, or represented from the point of view of, the coordinate
system $S$ or $\Sigma$. Taking into account the transformation equations,

$$f_\mu = \sum_m a_{m\mu} f_m, \qquad g_\nu = \sum_n a_{n\nu} g_n,$$

with the coefficients $a_{n\nu} = a^{-1}_{\nu n} = \cos(x_n, \xi_\nu)$ as before, we get

$$T_{\mu\nu} = \sum_m \sum_n a_{m\mu} a_{n\nu} T_{mn} = \sum_m \sum_n a^{-1}_{\mu m} T_{mn} a_{n\nu} \qquad (140 a)$$

and

$$T_{mn} = \sum_\mu \sum_\nu a_{m\mu} a_{n\nu} T_{\mu\nu} = \sum_\mu \sum_\nu a_{m\mu} T_{\mu\nu} a^{-1}_{\nu n}. \qquad (140 b)$$
These transformation equations can serve to define a tensor $T$ in the
general case, when its components cannot be put in the simple form
(140). These equations can obviously be written in the following matrix
form:

$$T_\Sigma = a^{-1} T_S a, \qquad T_S = a T_\Sigma a^{-1},$$

which makes it evident that a matrix $F_C$ representing some quantity
$F$ from the point of view of some other basic quantity $C$ can be
interpreted geometrically as a certain tensor $F$ in the state-space referred
to a system of coordinates whose axes represent the states specified by
the characteristic values of $C$.
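The equivalence of the component form (140 a) and the matrix form $T_\Sigma = a^{-1} T_S a$ can be illustrated on a tensor built from two vectors as in (140). The following sketch assumes an ordinary rotation about the third axis as the orthogonal matrix `a` (with `a[n][nu]` standing for $\cos(x_n, \xi_\nu)$) and arbitrary sample vectors `f` and `g`; none of these numbers come from the text.

```python
import math

def matmul(a, b):
    """Row-by-column matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(col) for col in zip(*a)]

# Assumption: a rotation by 0.5 rad about the third axis.
theta = 0.5
c, s = math.cos(theta), math.sin(theta)
a = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]
a_inv = transpose(a)          # for a real orthogonal matrix

f = [1.0, 2.0, 3.0]
g = [-1.0, 0.5, 4.0]
T_S = [[f[m] * g[n] for n in range(3)] for m in range(3)]   # (140) in S

# Components of f and g along the new axes.
f_new = [sum(a[m][mu] * f[m] for m in range(3)) for mu in range(3)]
g_new = [sum(a[n][nu] * g[n] for n in range(3)) for nu in range(3)]
T_from_vectors = [[f_new[mu] * g_new[nu] for nu in range(3)]
                  for mu in range(3)]

T_Sigma = matmul(matmul(a_inv, T_S), a)       # matrix form of (140 a)
T_back = matmul(matmul(a, T_Sigma), a_inv)    # matrix form of (140 b)

diff_forward = max(abs(T_Sigma[i][j] - T_from_vectors[i][j])
                   for i in range(3) for j in range(3))
diff_back = max(abs(T_back[i][j] - T_S[i][j])
                for i in range(3) for j in range(3))
print(diff_forward, diff_back)
```

Both differences should be zero to rounding error: transforming the vectors first and multiplying afterwards gives the same nine numbers as transforming the tensor as a matrix.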

The matrices $F_C$ representing real quantities are Hermitian, i.e.
satisfy the relation

$$F^*_{C''C'} = F_{C'C''},$$

which can be considered as the generalization of the condition

$$T_{mn} = T_{nm}, \qquad T_{\mu\nu} = T_{\nu\mu}$$

for the symmetrical tensors of ordinary analytical geometry.

Now such tensors admit of a very simple and suggestive geometrical
illustration, namely, that of a central quadric (ellipsoid, hyperboloid),
defined by the equation

$$\sum_m \sum_n T_{mn} x_m x_n = 1, \qquad (140 c)$$

in the coordinate system $S$, or

$$\sum_\mu \sum_\nu T_{\mu\nu} \xi_\mu \xi_\nu = 1, \qquad (140 d)$$

in the coordinate system $\Sigma$.

The fact that these two equations represent the same surface, i.e. that
the coefficients $T_{mn}$ and $T_{\mu\nu}$ are transformed into each other according
to equations (140 a) and (140 b), can be proved by substituting in
(140 d) the expressions $\xi_\nu = \sum_n a_{n\nu} x_n$, which gives

$$\sum_\mu \sum_\nu T_{\mu\nu} \sum_m \sum_n a_{m\mu} a_{n\nu} x_m x_n = 1$$

or, changing the order of summation with regard to the Greek and
Latin indices,

$$\sum_m \sum_n x_m x_n \Bigl[\sum_\mu \sum_\nu T_{\mu\nu} a_{m\mu} a_{n\nu}\Bigr] = 1,$$

which, in view of (140 b), coincides with (140 c).

The components of a symmetrical tensor referred to a system of 
coordinates can thus be interpreted as the coefficients in the equation 
of a certain central quadric referred to the same coordinate system; 
this makes it possible to visualize a symmetrical tensor, without any
reference to a system of coordinates, as the quadric surface which it defines.

It should be mentioned that a quadric surface can be defined, according
to (140 c), by a non-symmetrical tensor just as well as by a
symmetrical one. But it will actually contain the sum of the components
$T_{mn} + T_{nm}$ referring to the coordinates $x_m$ and $x_n$ as the coefficient of
their product $x_m x_n$. The asymmetry of $T$, if any, will therefore not be
manifested in the shape of the surface, or, in other words, the latter
will define only the symmetrical part of $T$. Thus a tensor can be
completely specified by a quadric surface only when it is symmetrical.
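That the quadric form sees only the symmetrical part of a tensor is easily confirmed numerically. In the sketch below the non-symmetrical tensor `T` and the sample points are arbitrary assumptions; `quadric_form` evaluates the left side of (140 c).

```python
def quadric_form(T, x):
    """Left side of (140 c): double sum of T[m][n] * x[m] * x[n]."""
    size = len(x)
    return sum(T[m][n] * x[m] * x[n] for m in range(size) for n in range(size))

# Assumption: an arbitrary non-symmetrical tensor in three dimensions.
T = [[2.0, 3.0, 0.0],
     [-1.0, 1.0, 4.0],
     [0.5, 0.0, 5.0]]

# Its symmetrical part, (T_mn + T_nm) / 2.
T_sym = [[0.5 * (T[m][n] + T[n][m]) for n in range(3)] for m in range(3)]

points = [[1.0, 0.0, 0.0], [1.0, -2.0, 0.5], [0.3, 0.7, -1.1]]
differences = [abs(quadric_form(T, x) - quadric_form(T_sym, x))
               for x in points]
print(differences)
```

The differences vanish for every point, so `T` and `T_sym` define one and the same surface although they are different tensors.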

Every central surface of the second order has three mutually
perpendicular axes of symmetry, which can be defined by the condition
that, referred to a system of coordinates $\Sigma$ whose axes coincide with
its symmetry axes, the equation of the quadric reduces to the ‘canonical’
form

$$\sum_\mu T_{\mu\mu} \xi_\mu^2 = 1,$$

not containing products of different coordinates.

This can be expressed by saying that the matrix $T_\Sigma$ considered from
this point of view is diagonal. The possibility of reducing the equation
of a central quadric to the canonical form, i.e. the existence of symmetry
axes, is proved by a well-known method which at the same time leads
to the actual determination of the cosines between these axes and the
original axes $x_n$, i.e. of the coefficients of the orthogonal transformation
$S \to \Sigma$, and of the diagonal elements of the transformed matrix, or, in
other words, of the characteristic values of the tensor $T$, $T_{\mu\mu} = T'$.

This method consists in defining the vertices of the quadric (i.e. the
end-points of the symmetry axes) by any one of the following
conditions:

(1) The normals to the surface at the vertices coincide in direction
with the radii vectores from the centre. This condition leads to the
equations

$$\frac{\partial F}{\partial x_m} \ \text{proportional to}\ x_m,$$

where $F$ denotes the left side of equation (140 c), or, if the
proportionality factor (with the numerical factor 2 absorbed in it) is
denoted by $T'$:

$$\sum_n T_{mn} x_n = T' x_m. \qquad (141)$$

So long as we are dealing with ordinary three-dimensional space, this
is a set of three linear equations which are compatible with each other
if their determinant vanishes. The latter condition gives a cubic
equation for $T'$ and to the three roots of it there correspond three sets of
$x_n$ values, $x_{nT'}$, say, which define three mutually perpendicular vectors,
and reduce to the cosines of the angles between the old axes and the
symmetry axes if normalized to unity. The three values of $T'$ turn out
to be the three non-vanishing diagonal elements of the transformed
matrix or tensor $T_\Sigma$.

(2) The distances of the vertices from the centre or their squares
$r^2 = \sum_n x_n^2$ have the largest or smallest possible values, consistent with
the equation

$$F \equiv \sum_m \sum_n T_{mn} x_m x_n = 1.$$

This gives, with the help of Lagrange’s method of undetermined
multipliers, a system of equations derived from

$$\delta r^2 + \lambda\,\delta F = 0 \qquad (141 a)$$

by equating to zero the coefficients of the variations of the separate
coordinates with a properly chosen value of the coefficient $\lambda$. Putting
$\lambda = -1/T'$, we again get equations (141).

It should be mentioned that the variational equation (141 a) can be
interpreted as the condition that $F$ should have a maximum, minimum,
or stationary value while $r^2$ is kept constant, for instance equal to unity.

(3) Finally we could find the symmetry axes of $T$ by defining the
transformation coefficients $a_{m\mu}$ in equations (140 a) in such a way that
the three transformed non-diagonal components of $T$ vanish, or, in
other words, that the transformed matrix $T_\Sigma$ be diagonal. This again,
as can easily be shown, leads to equations (141) or, more exactly, to

$$\sum_n T_{mn} x_{nT'} = T' x_{mT'}.$$

These equations, as well as equations (141), are obviously of the same 
type as equations (122 b) or (123) of § 16 defining the transformation 
of the matrix $K_H$ to the diagonal matrix $K_K$. They only differ in the
number of dimensions, this being equal to three in the case of ordinary 
space and to infinity in the case of the state-space to which the latter 
equations refer. Another difference between them and the correspond¬ 
ing elementary equations is that the vectors and tensors with which 
we have to do in the case of the state-space are complex, the symmetry 
condition for the ordinary tensors being replaced by the Hermitian 
condition for the tensors in the state-space. 
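The determination of the symmetry axes from equations (141) can be followed on the smallest possible example. The sketch below is an illustration under stated assumptions: it uses two dimensions instead of three (so that the determinant condition is a quadratic rather than a cubic equation), a sample symmetrical tensor with positive characteristic values, and helper names of its own.

```python
import math

# Assumption: a sample symmetrical tensor in two dimensions (an ellipse).
T = [[2.0, 1.0], [1.0, 2.0]]

# Vanishing determinant of (141): t**2 - trace * t + det = 0.
trace = T[0][0] + T[1][1]
det = T[0][0] * T[1][1] - T[0][1] * T[1][0]
root = math.sqrt(trace ** 2 - 4.0 * det)
values = [(trace - root) / 2.0, (trace + root) / 2.0]

def axis(T, t):
    """Unit vector solving (141) for the value t (needs T[0][1] != 0)."""
    v = [T[0][1], t - T[0][0]]
    length = math.hypot(v[0], v[1])
    return [v[0] / length, v[1] / length]

def quadric_form(T, x):
    """Left side of (140 c)."""
    return sum(T[m][n] * x[m] * x[n] for m in range(2) for n in range(2))

axes = [axis(T, t) for t in values]

# Vertices lie on the axes at the distance 1 / sqrt(T') from the centre.
vertices = [[c / math.sqrt(t) for c in ax] for ax, t in zip(axes, values)]

dot = axes[0][0] * axes[1][0] + axes[0][1] * axes[1][1]
on_surface = [quadric_form(T, v) for v in vertices]
print(values, dot, on_surface)
```

The two axes come out mutually perpendicular, and each vertex satisfies the equation of the quadric, which shows the characteristic values to be the reciprocals of the squares of the semi-axes.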

With this amendment, which from the purely analytical point of view 
is merely a trivial generalization of the ideas and relations of ordinary 
analytical geometry, we can apply the tensor idea and the idea of a 
quadric central surface in the state-space for the representation of 
physical quantities which have hitherto been represented by Hermitian 
matrices. The idea of a tensor, together with the ‘principle of relativity’ 
in the choice of the coordinate system, is actually equivalent to the
idea of a matrix in conjunction with the principle of relativity of the
basic quantities which determine the coordinate system. 

The additional feature of the geometrical representation derived by
generalizing the ordinary geometrical theory is the possibility of
thinking of a quantity $F$ as pictured, as it were, by a central quadric surface
in the state-space, the axes of symmetry of this surface representing
the different states specified by the characteristic values of $F$, and these
characteristic values being inversely proportional to the squares of the
length of these axes drawn from the centre to the vertices (without
being prolonged to infinity). The latter relation follows from the fact
that in the canonical form of the equation of the quadric

$$\sum_\mu T_{\mu\mu} \xi_\mu^2 = 1$$

the coefficients $T_{\mu\mu}$, which are obviously the reciprocals of the squares
of the lengths of the axes (with positive or negative sign), represent at
the same time the characteristic values $T'$ (or $T'$, $T''$, $T'''$) of the
tensor $T$.

The equation of a quadric surface representing in the state-space
a certain quantity $F$ referred to the symmetry axes of the quadric
surface which represents some other quantity, $C$, say, can be written
in the form

$$\sum_{C'} \sum_{C''} F_{C'C''} a^*_{C'} a_{C''} = \text{const.}, \qquad (142)$$

if the values of $C$ form a discrete set, or in the form

$$\iint F_{C'C''} a^*_{C'} a_{C''}\,dC'\,dC'' = \text{const.}, \qquad (142 a)$$

if they vary in a continuous manner, while the expression

$$E = \sum_{C'} a^*_{C'} a_{C'} \qquad (142 b)$$

or

$$E = \int a^*_{C'} a_{C'}\,dC' \qquad (142 c)$$

can be interpreted as the square of the distance from the common
centre of the two surfaces to some point with the coordinates $a_{C'}$.

The characteristic values of $F$ and the states specified by them can
be found by transforming the quadric (142) to the canonical form, i.e.
to the symmetry axes of $F$. This problem, as we know already, is
solved by the transformation equations

$$\sum_{C''} F_{C'C''} a_{C''} = F' a_{C'} \quad\text{or}\quad \int F_{C'C''} a_{C''}\,dC'' = F' a_{C'}, \qquad (143)$$

the resulting normalized $a_{C'} = a_{C'F'} = (C'|F')$ being the cosines of the
angles between the symmetry axes of $C$ and those of $F$, or, from the
physical point of view, the probability amplitudes for getting a certain
value for $C$ when that of $F$ is supposed to be known.

An important relationship between the two quantities is expressed
by the coincidence of the symmetry axes of the associated surfaces.
This means the coincidence of the states specified by the corresponding
characteristic values of $F$ and $C$ and is equivalent to the condition
that $F$ and $C$, defined as matrices or operators from any common point
of view ($Q$ say), commute with each other. To prove this we shall first
put $Q = C$. The matrices $F_C$ and $C_C$, being both diagonal, must
commute with each other, since their product is also a diagonal matrix,
independent of the order of the factors:

$$(FC)_{C'C''} = F_{C'C'} C_{C'C'} \delta_{C'C''} = (CF)_{C'C''}.$$

Now when $Q \neq C$ one can always define a (unitary) transformation
matrix $b$ which will transform $C$ into $Q$ according to the equation
$Q = bCb^{-1}$. According to the invariance property with regard to
canonical transformations of this form expressed by equation (124 a),
we must have

$$F_Q C_Q - C_Q F_Q = b(F_C C_C - C_C F_C)b^{-1} = 0.$$
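The invariance of the commutation relation under a unitary change of the point of view can be verified on a two-dimensional example. In the sketch below the diagonal matrices `F_C` and `C_C` and the unitary matrix `b` are arbitrary assumptions chosen for illustration, and the helper names are not from the text.

```python
import cmath
import math

def dagger(a):
    """Adjoint matrix: interchange rows and columns, take conjugates."""
    return [[a[i][j].conjugate() for i in range(len(a))]
            for j in range(len(a[0]))]

def matmul(a, b):
    """Row-by-column matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

# Assumption: two quantities with common symmetry axes, hence both
# diagonal from the point of view of C.
F_C = [[2 + 0j, 0j], [0j, -1 + 0j]]
C_C = [[0.5 + 0j, 0j], [0j, 3 + 0j]]

# Assumption: an arbitrary unitary matrix b leading to another point of view Q.
theta, alpha = 0.7, 0.3
c, s = math.cos(theta), math.sin(theta)
p = cmath.exp(1j * alpha)
b = [[complex(c), -s * p], [s * p.conjugate(), complex(c)]]

def to_Q(M):
    """Transformed matrix b M b^{-1}, with b^{-1} equal to the adjoint of b."""
    return matmul(matmul(b, M), dagger(b))

F_Q, C_Q = to_Q(F_C), to_Q(C_C)
FC, CF = matmul(F_Q, C_Q), matmul(C_Q, F_Q)
commutator_size = max(abs(FC[i][j] - CF[i][j])
                      for i in range(2) for j in range(2))
print(commutator_size, abs(F_Q[0][1]))
```

The commutator vanishes to rounding error although `F_Q` and `C_Q` are no longer diagonal, as the second printed number shows.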

The transformation equations from $C$ to $F$ in the general case when
these quantities do not commute can be derived from a variational
principle of the same type as that which serves to determine the vertices
of a quadric in ordinary analytical geometry. We can put, namely,
$\delta E = 0$, subject to the condition (142) or (142 a), giving

$$\delta F - F'\,\delta E = 0, \qquad (143 a)$$

where $F$ denotes the left-hand side of (142) or (142 a) and $E$ the
expression (142 b) or (142 c) respectively, while $F'$ is an undetermined
multiplier. This equation can also be interpreted as expressing the fact that
$\delta F = 0$ subject to the condition that $E$ = const. (= 1, say). The
variations of $a_{C'}$ and $a^*_{C'}$ must be considered as independent of each
other and their coefficients in (143 a) set equal to zero, which leads to
the transformation equations (143) and their conjugate complex (i.e. the
equations of the reciprocal transformation).

The ‘conditioned’ variational equations $\delta E = 0$ with $F$ = const., or
$\delta F = 0$ with $E$ = const., can be replaced by the ‘unconditioned’
variational equation

$$\delta(F/E) = 0, \qquad (143 b)$$

which automatically provides for the normalization of the functions
$a_{C'}$ so far as the value of $F$ is concerned. If, indeed, the $a_{C'}$ are not
normalized, then the functions $a_{C'}/\sqrt{E}$ can be considered as their
normalized values and $F/E$ as the value of $F$ subject to the appropriate
normalization conditions $\sum_{C'} a^*_{C'} a_{C'} = 1$ or $\int a^*_{C'} a_{C'}\,dC' = 1$.

It is obvious from the comparison of (143 b) with (143) that the
stationary values of $F/E$ are just equal to the characteristic values $F'$,
a fact which can be ascertained directly with the help of the
transformation equations. Taking, for instance,
$F/E = \sum_{C'} \sum_{C''} F_{C'C''} a^*_{C'} a_{C''}/E$, then, since
$\sum_{C''} F_{C'C''} a_{C''} = F' a_{C'}$, we get $F/E = F' \sum_{C'} a^*_{C'} a_{C'}/E = F'$.

The variational principle which we have just considered is a
generalization of the variational principle for the energy, which was considered
in the preceding chapter under the form $\delta H = 0$, with $H = \int \psi^{0*} H \psi^0\,dV$
and $E = \int \psi^{0*} \psi^0\,dV = 1$. It reduces to the preceding form if $\psi^0$ is
replaced by the sum $\sum_{C'} a_{C'} \psi^0_{C'}$, the $\psi^0_{C'}$ being the characteristic functions
of the operator $C^{(x)}$ which may be supposed to represent a Hamiltonian
slightly different from that represented by the operator $H$ or, more
exactly, $H^{(x)}$.
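The statement that the quotient $F/E$ of (143 b) is stationary, and equal to a characteristic value $F'$, at a solution of (143) can be tried out on a small Hermitian matrix. The following sketch rests on assumptions of its own: a sample 2×2 matrix `F_C`, an unnormalized solution `a` belonging to $F' = 3$, and an arbitrary small displacement `d`.

```python
# Assumption: a sample Hermitian matrix with characteristic values 1 and 3.
F_C = [[2 + 0j, 1j], [-1j, 2 + 0j]]

def F_value(a):
    """Left side of (142): double sum of F[i][j] * conj(a[i]) * a[j]."""
    return sum(F_C[i][j] * a[i].conjugate() * a[j]
               for i in range(2) for j in range(2))

def E_value(a):
    """Expression (142 b): sum of the squared moduli."""
    return sum(abs(c) ** 2 for c in a)

def quotient(a):
    """F/E of (143 b); real because F_C is Hermitian."""
    return (F_value(a) / E_value(a)).real

# Unnormalized solution of (143) with F' = 3 (E = 8, not 1).
a = [2 + 0j, -2j]

# Assumption: an arbitrary small displacement of the coordinates.
d = [0.3 - 0.2j, 0.1 + 0.4j]
eps = 1e-4
shifted = [a[i] + eps * d[i] for i in range(2)]

print(quotient(a), quotient(shifted) - quotient(a))
```

The quotient equals the characteristic value without any normalization of `a`, and a displacement of order `eps` changes it only in the order `eps**2`, which is what the vanishing of the first variation means.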

This leads to a problem of the perturbation theory, which, from the 
geometrical point of view, outlined in this section, can be regarded as 
the problem of finding the symmetry axes of the quadric surface $H$,
whose equation is referred to the symmetry axes of a slightly different 
quadric C. 

More generally we can say that from this geometrical point of view 
the quantum mechanics can be regarded as the analytical geometry of 
central quadric surfaces in the state-space; the symmetry axes of each 
such surface specify, by their length, the characteristic values of the 
physical quantity represented by this surface, and, by their direction, 
the associated states; while the cosines between the symmetry axes of 
two different surfaces represent the probability amplitudes for a certain 
value of one quantity (or set of three quantities) when the other 
quantity (or set of three quantities) is known to have a given value. 

In conclusion a few remarks should be added on the question of
notations. Dirac and following him many other authors denote the
elements of a matrix $F_C$ by the symbol $(C'|F|C'')$ which is equivalent
to the symbol $F^0_{C'C''}$ used in this chapter, and which has the advantage
of being closely connected with the symbol $(F'|C')$ for the probability
amplitudes. Using Dirac’s notation, we can write the transformation
equations connecting the matrices $F_H$ and $F_K$ in the following
form:

$$(K'|F|K'') = \sum_{H'} \sum_{H''} (K'|H')(H'|F|H'')(H''|K''),$$

if the spectrum of $H$ is discrete, or

$$(K'|F|K'') = \iint (K'|H')\,dH'\,(H'|F|H'')\,dH''\,(H''|K''),$$

if it is continuous.

The index 0 in our notation serves to indicate that the time, which 
is supposed not to appear in the equations of this chapter, is ignored. 
We shall take it into account in a later section. 

Another remark refers to a type of vector notation applied by Dirac 
to vectors and tensors in the state-space and quite similar to that used 
in the ordinary three-dimensional vector and tensor analysis.

A state, in the quantum-mechanical sense, is specified by a vector,
$\psi$ say, of unit length and of a definite direction in the state-space. The
components of this vector with respect to a system of coordinates $C$
may be denoted by $\psi_{C'}$. The same state can, however, be specified by
the conjugate complex of $\psi$, which is a vector $\psi^*$ with the components
$\psi^*_{C'}$.

The sum $\sum_{C'} \psi^*_{C'} \psi_{C'}$ or the integral $\int \psi^*_{C'} \psi_{C'}\,dC'$ which is the measure
of the square of the common length of the vectors $\psi^*$ and $\psi$ will be
denoted as their ‘scalar product’ $\psi^*\psi$. In a similar way the scalar
product of two different vectors $\psi_1$ and $\psi_2$ referring to two different states
will be denoted by $\psi_2^* \psi_1$ or $\psi_1^* \psi_2$, which means, in the coordinate
representation, $\sum_{C'} \psi^*_{2C'} \psi_{1C'}$ or $\sum_{C'} \psi^*_{1C'} \psi_{2C'}$ (the sums being again replaced by
integrals in the case of a continuous $C$-spectrum).

These expressions (which are conjugate complex with regard to each 
other) can be regarded, from the physical point of view, as the proba¬ 
bility amplitudes for the simultaneous occurrence of the two states 
(a measure of the ‘mutual compatibility’ of the latter). If these states 
are alternative (mutually exclusive), the vectors $\psi_1$ and $\psi_2$ are mutually
orthogonal, which means that $\psi_2^* \psi_1 = \psi_1^* \psi_2 = 0$.

Further, let $F$ denote a tensor representing not a state, such as $\psi$,
but a certain physical quantity (an ‘observable’ or ‘dynamical variable’
according to Dirac), with the components $F_{C'C''}$ along the axes (= states)
of $C$ (we are dropping for convenience the superscript zero). The sum
$\sum_{C''} F_{C'C''} \psi_{C''}$ (or integral $\int F_{C'C''} \psi_{C''}\,dC''$) can be considered as the
$C'$-component of another vector, $\phi$, say, specifying some state, in general
different from $\psi$. This vector will be called the product of the tensor $F$
and the vector $\psi$ and denoted by $F\psi$ [so that $(F\psi)_{C'} = \sum_{C''} F_{C'C''} \psi_{C''}$].

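The conjugate complex of $\phi$ can be defined in a similar way as the
product of $F$ and $\psi^*$ taken in the inverse order, i.e. by the formula
$\phi^* = \psi^* F$, which means, in the coordinate representation,

$$\phi^*_{C'} = (\psi^* F)_{C'} = \sum_{C''} \psi^*_{C''} F_{C''C'} \quad \Bigl(\text{or } \int \psi^*_{C''} F_{C''C'}\,dC''\Bigr).$$

This gives

$$\psi^*\phi = \sum_{C'} \psi^*_{C'} \phi_{C'} = \sum_{C'} \sum_{C''} \psi^*_{C'} F_{C'C''} \psi_{C''} \quad \Bigl(\text{or } \iint \psi^*_{C'} F_{C'C''} \psi_{C''}\,dC'\,dC''\Bigr),$$

which will be denoted simply as $\psi^* F \psi$.

We get further (taking for the sake of simplicity the case of a discrete
$C$-spectrum)

$$\phi^*\phi = \sum_{C'} \phi^*_{C'} \phi_{C'} = \sum_{C'} \sum_{C''} \sum_{C'''} \psi^*_{C''} F_{C''C'} F_{C'C'''} \psi_{C'''}$$

or, since $\sum_{C'} F_{C''C'} F_{C'C'''} = (F^2)_{C''C'''}$,
we get $\phi^*\phi = \psi^* F^2 \psi$.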

The preceding formula is the simplest example of a ‘tensor product’.
The product of two tensors $F$ and $G$ taken in the order stated is defined
as a tensor with the components

$$(FG)_{C'C'''} = \sum_{C''} F_{C'C''} G_{C''C'''} \quad\text{or}\quad \int F_{C'C''} G_{C''C'''}\,dC''.$$

This definition of tensor multiplication is identical with the definition
of matrix multiplication if $F$ and $G$ are considered not as tensors but
as matrices.

The matrix representation can also be applied to vectors such as $\psi$
if we generalize the conception of a matrix by admitting matrices which
consist not of a square array of numbers (elements, components) but
of a rectangular array (with a different number of rows and columns)
and, in particular, of a linear array with one row or one column only.
If we wish to preserve the general multiplication law, i.e. that the
product of two matrices shall be a matrix obtained by combining the
rows of the first factor with the columns of the second, we must
represent the vector $\psi$ and its conjugate complex $\psi^*$ by linear matrices of
different kinds, the one, considered as the first factor, consisting of one
row only and the second of one column only.

Taking the components of $\psi$ and $\psi^*$ along the $C$-axes as the elements
of the matrices $\psi_C$ and $\psi^*_C$, we shall put accordingly

$$\psi^*_C = (\psi^*_{C'},\ \psi^*_{C''},\ \ldots) \quad\text{and}\quad \psi_C = \begin{pmatrix} \psi_{C'} \\ \psi_{C''} \\ \vdots \end{pmatrix},$$

which means that in multiplying two vectors or a vector and a tensor
we must always start with the conjugate complex ($\psi^*$, $\phi^*$) and finish
with the original ones. From the matrix point of view we should write
$\psi^\dagger$ (adjoint matrix) instead of $\psi^*$, for the matrix $\psi^*$ defined above is
obtained from the matrix $\psi$ not only by taking the conjugate complex
of its elements, but also by an interchange of the rows and columns (cf.
§ 16). With this convention the scalar product of two vector-matrices
$\psi$ and $\phi$ can be written in the form $\psi^\dagger\phi$ or $\phi^\dagger\psi$, while the symbols
$\psi\phi^\dagger$ or $\phi\psi^\dagger$ have no meaning. Taking the components of $\psi^\dagger\phi$ in the
usual way, we get

$$(\psi^\dagger\phi)_{mn} = \sum_{C'} \psi^\dagger_{mC'} \phi_{C'n},$$

which is equal to zero unless $m = 1$ (first row of $\psi^\dagger$) and $n = 1$ (first
column of $\phi$).

The product of a vector $\psi$ and a tensor $F$ must be represented
accordingly in either of the two forms $F\psi$ or $\psi^\dagger F$, the former being
a matrix of the same form as $\psi$ and the latter a matrix of the same
form as $\psi^\dagger$. The two matrices are, of course, adjoint with regard to
each other, so that we can write

$$(\psi^\dagger F)^\dagger = F\psi,$$

which is quite natural since $F^\dagger = F$ (so long as $F$ is a Hermitian
matrix).
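The rules for multiplying row and column matrices can be made concrete as follows. The sketch assumes a two-dimensional state-space, a sample Hermitian matrix `F` and a sample unit vector `psi`; the names `adjoint` and `matmul` are illustrative helpers and nothing more.

```python
def adjoint(m):
    """Adjoint of a rectangular matrix: transpose and conjugate."""
    return [[m[i][j].conjugate() for i in range(len(m))]
            for j in range(len(m[0]))]

def matmul(a, b):
    """Rows of the first factor combined with columns of the second."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

# Assumption: a sample state as a one-column matrix of unit length.
psi = [[0.6 + 0j], [0.8j]]

# Assumption: a sample Hermitian tensor.
F = [[1 + 0j, 2 - 1j], [2 + 1j, -1 + 0j]]

phi = matmul(F, psi)                       # the vector F psi (one column)
row = matmul(adjoint(psi), F)              # the matrix psi-dagger F (one row)

length_squared = matmul(adjoint(psi), psi)[0][0]
lhs = matmul(adjoint(phi), phi)[0][0]                       # phi* phi
rhs = matmul(matmul(adjoint(psi), matmul(F, F)), psi)[0][0]  # psi* F^2 psi

adjoint_gap = max(abs(adjoint(row)[i][0] - phi[i][0]) for i in range(2))
print(length_squared, lhs, rhs, adjoint_gap)
```

Here `lhs` and `rhs` agree, reproducing $\phi^*\phi = \psi^* F^2 \psi$, and the adjoint of the one-row matrix `row` coincides with the one-column matrix `phi`.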

It should be mentioned finally that the linear matrices with the
elements $\psi_{1C'} = \psi_{C'}$ can be replaced by ‘square’ matrices with the
elements $\psi_{Q'C'}$ representing a set of vectors, which correspond to different
values of $Q'$, or, in other words, the cosines between the directions $Q'$
and $C'$. Such matrices are not Hermitian but unitary, i.e. satisfy the
relation $\psi^\dagger = \psi^{-1}$ ($\psi^\dagger_{C'Q'} = \psi^*_{Q'C'}$). The preceding formulae, relating to
the products of the type $\phi^\dagger\psi$ or $F\psi$, etc., remain valid with this
interpretation of the $\psi$, i.e. not as vectors specifying states, but as cosines
between two sets of axes specifying two sets of states and measuring
the probability amplitudes of their coexistence. The transformation
equations $\phi^0_{K'} = \sum_{H'} \psi^0_{H'} a_{H'K'}$ can be written accordingly in the form
$\phi_{x'K'} = \sum_{H'} \psi_{x'H'} a_{H'K'}$, or $\phi = \psi a$ (the order of the factors on the right
side being opposite to that which corresponds to the product of $\psi$
considered as a vector with a matrix representing a tensor).



PERTURBATION THEORY 

19. Perturbation Theory not involving the Time (Method of 
Stationary States) 

The exact determination of the wave functions $\psi_{x'H'} = (x'|H')$ which
specify the motion of a particle in a complicated field of force is usually
impossible on account of analytical difficulties. But even if these
difficulties could be overcome, it would hardly be possible to use the results,
and especially to visualize them, on account of their complicated
character. Thus both for mathematical and physical reasons it is
desirable, in the case of a complicated field of force, to use an
approximate method of determining the functions $\psi^0$, starting with an exact
determination of the latter for the motion in a simplified field of force,
and introducing corrections to represent the effect of the ‘perturbing
forces’, i.e. those forces which have been left out of account at the
beginning.

The energy operator corresponding to the ‘unperturbed’, i.e.
simplified, motion will be denoted by $H$ ($= H^{(x)}$) and its characteristic
functions by $\psi^0_{H'}$ ($= \psi_{x'H'}$). The energy operator corresponding to the actual
or ‘perturbed’ motion will be denoted by $K$ ($= K^{(x)}$) and its
characteristic functions by $\phi^0_{K'}$ ($= \phi_{x'K'}$).

The difference $K - H = S$ will thus represent the additional or
‘perturbation’ energy; it is usually defined as the potential energy of the
perturbing forces.

This perturbation energy must, of course, be regarded as ‘small’. 
The exact meaning of this condition will become apparent as we develop 
the problem by the method of the perturbation theory. 

As already mentioned, the perturbation theory (so far as H and K 
do not involve the time) amounts to a transformation of all physical 
quantities, considered as matrices, from the point of view of H to the 
point of view of $K$, which is supposed to be but slightly different from
$H$, so that the actual calculations can be carried out by means of the
method of successive approximations. 

The principle of this method consists in regarding all quantities
involving $S$, for instance the matrix elements $S_{H'H''}$, as small quantities
of the first order and splitting up the exact equations into a chain of
approximate equations containing small quantities of the same order.

We shall first assume that $H$ has a discrete spectrum and that the
unperturbed motion is not degenerate, the characteristic values of $H$
being thus sufficient for the complete specification of the corresponding
states.

The fundamental part of our problem will consist in the transformation
of the matrix $K_H$ to the diagonal form $K_K$ and in the determination
of the transformation matrix $a$, according to the general equation

$$K_K = a^\dagger K_H a \qquad (a^\dagger = a^{-1}), \tag{144}$$

or

$$K_H a = a K_K, \tag{144a}$$

that is [cf. (123), § 16],

$$\sum_{H''} K_{H'H''}\, a_{H''K'''} = K'''\, a_{H'K'''}. \tag{144b}$$


We must, first of all, fix the ‘zero approximation’ which corresponds
to $S = 0$, i.e. to the actual coincidence of $K$ and $H$. Assuming the
identical states to be labelled by the letters $K$ or $H$ with the same
number of dashes ($K' = H'$, $K'' = H''$, $K''' = H'''$, etc.), we can put, in
this case,

$$a = \delta, \quad\text{that is,}\quad a_{H'K''} = \delta_{H'K''}, \tag{145}$$

where $\delta$ is the mixed unit matrix with the diagonal elements
$\delta_{H'K'} = \delta_{H''K''} = \ldots = 1$ (all others being equal to zero).

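Equations (144b) reduce, in this case, to

$$\sum_{H''} K^0_{H'H''}\, \delta_{H''K'''} = K'''\, \delta_{H'K'''},$$

that is, for $K''' = K'$, to

$$K^0_{H'H'} = K', \tag{145a}$$

which is the same thing as $K' = H'$, since $K^0_{H'H'} = H_{H'H'} = H'$.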

We shall now consider the actual case in which $S \neq 0$, assuming that
there still exists in this case a one-to-one correspondence between the
unperturbed states $H', H'', H''', \ldots$ and the perturbed states
$K', K'', K''', \ldots$, in the sense that the states labelled by the letter
$K$ or $H$ with the same number of dashes coincide with each other when the
perturbation energy $S$ tends to zero.

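We shall put accordingly

$$K_{K'K'} = K' = H' + \Delta H', \tag{146}$$

where $\Delta H'$ denotes the change of the energy-levels due to the
perturbation, and

$$a = \delta + \Delta a, \quad\text{i.e.}\quad a_{H'K''} = \delta_{H'K''} + \Delta a_{H'K''}, \tag{146a}$$

the corrections $\Delta a_{H'K''}$ being assumed to be small (compared with 1).
We have further

$$K_H = H_H + S_H, \quad\text{i.e.}\quad K_{H'H''} = H^0_{H'H''} + S^0_{H'H''}. \tag{146b}$$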
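Substituting these expressions in equations (144b) and taking into
account that $H^0_{H'H''} = H' \delta_{H'H''}$, we get

$$\sum_{H''} (H' \delta_{H'H''} + S^0_{H'H''})(\delta_{H''K'''} + \Delta a_{H''K'''})
  = (H''' + \Delta H''')(\delta_{H'K'''} + \Delta a_{H'K'''}).$$

Since $\delta_{H''K'''} = 0$ unless $H'' = H'''$, when it is equal to 1, and
$(H''' - H')\, \delta_{H'K'''} = 0$ both when $K''' = K'$ (because then $H''' = H'$)
and when $K''' \neq K'$, we get

$$S^0_{H'H'''} + \sum_{H''} S^0_{H'H''}\, \Delta a_{H''K'''}
  = \Delta H''' (\delta_{H'K'''} + \Delta a_{H'K'''}) + (H''' - H')\, \Delta a_{H'K'''}. \tag{147}$$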

These equations can be solved by successive approximations, if we
assume that the quantities $S^0_{H'H''}$ (i.e. the matrix elements of the
perturbation energy ‘from the point of view’ of the unperturbed energy)
are small quantities of the same (first) order of magnitude and expand
$\Delta H'$ and $\Delta a$ in series of the form

$$\Delta H' = \Delta_1 H' + \Delta_2 H' + \ldots, \qquad
  \Delta a = \Delta_1 a + \Delta_2 a + \ldots, \tag{147a}$$

where $\Delta_n H'$ and $\Delta_n a$ are corrections of the $n$th order (that is,
of the same order of magnitude as the $n$th power of the elements of $S_H$).

Substituting (147a) in (147) and dropping terms of the second and
higher orders of magnitude, we obtain as a first approximation the
equations

$$S^0_{H'H'''} = \Delta_1 H'''\, \delta_{H'K'''} + (H''' - H')\, \Delta_1 a_{H'K'''}. \tag{148}$$

Putting $K''' = K'$ (and consequently $H''' = H'$), we get

$$\Delta_1 H' = S^0_{H'H'}. \tag{148a}$$

This formula determines, to the first approximation, the change of the 
energy-levels produced by the perturbation. 
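As a hedged illustration of (148a), the following Python sketch evaluates the diagonal matrix element, written as the integral of $\psi^2 S$, for a model that is not taken from the text: a particle in a one-dimensional box of unit length with the hypothetical perturbation $S(x) = 0.1\,x$. The function names, the midpoint quadrature, and the step count are assumptions made for the example only. Each unperturbed density is symmetric about the middle of the box, so the first-order shift should come out the same for every level.

```python
import math

def box_wave_function(n, x, length=1.0):
    """Normalized particle-in-a-box function (an assumed model, not from the text)."""
    return math.sqrt(2.0 / length) * math.sin(n * math.pi * x / length)

def linear_perturbation(x):
    """Hypothetical perturbation energy S(x) = 0.1 * x."""
    return 0.1 * x

def first_order_shift(n, perturbation, length=1.0, steps=2000):
    """Midpoint-rule estimate of the integral of psi_n(x)^2 * S(x) over the box."""
    h = length / steps
    total = 0.0
    for k in range(steps):
        x = (k + 0.5) * h
        psi = box_wave_function(n, x, length)
        total += psi * psi * perturbation(x) * h
    return total

shifts = []
for n in (1, 2, 3):
    shifts.append(first_order_shift(n, linear_perturbation))
print(shifts)  # each entry is close to 0.05, i.e. S evaluated at the box centre
```
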

If $K'''$ is different from $K'$ (and consequently $H'''$ is different from
$H'$), equation (148) reduces to

$$S^0_{H'H'''} = (H''' - H')\, \Delta_1 a_{H'K'''},$$

that is,

$$\Delta_1 a_{H'K'''} = \frac{S^0_{H'H'''}}{H''' - H'}, \tag{148b}$$

giving the first-order expressions for the transformation coefficients
$a_{H'K'''}$.

If we preserve in (147) terms of the second order, dropping terms of
the third and higher orders, and take account of the first-order equations
(148), we get the second-order equations:

$$\sum_{H''} S^0_{H'H''}\, \Delta_1 a_{H''K'''}
  = \Delta_2 H'''\, \delta_{H'K'''} + \Delta_1 H'''\, \Delta_1 a_{H'K'''} + (H''' - H')\, \Delta_2 a_{H'K'''}. \tag{149}$$

It should be remarked that these equations, as well as the equations 
of the succeeding orders, can be obtained from (147) by substituting 
the expressions (147 a) and dropping all terms with the exception of 
those of the order in question. 

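Putting $K''' = K'$ (and $H''' = H'$) in (149), we get

$$\sum_{H''} S^0_{H'H''}\, \Delta_1 a_{H''K'} = \Delta_2 H' + \Delta_1 H'\, \Delta_1 a_{H'K'},$$

or, on account of the relation (148a),

$$\Delta_2 H' = \sum_{H'' \neq H'} S^0_{H'H''}\, \Delta_1 a_{H''K'}.$$

Substituting the expressions (148b) with $K'''$ replaced by $K'$ and $H'$ by
$H''$, we get

$$\Delta_2 H' = \sum_{H'' \neq H'} \frac{S^0_{H'H''}\, S^0_{H''H'}}{H' - H''}
  = \sum_{H'' \neq H'} \frac{|S^0_{H'H''}|^2}{H' - H''}. \tag{149a}$$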
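The first- and second-order energy formulae can be checked on the simplest possible case. The sketch below is an illustration under stated assumptions, not a computation from the text: it takes two hypothetical unperturbed levels and a hypothetical real symmetric perturbation matrix, for which the exact characteristic values of $K = H + S$ are known in closed form, and compares them with $H' + \Delta_1 H'$ and $H' + \Delta_1 H' + \Delta_2 H'$.

```python
import math

def second_order_shift(levels, S, i):
    """Sum over j != i of |S[i][j]|^2 / (H_i - H_j), the second-order term."""
    total = 0.0
    for j in range(len(levels)):
        if j != i:
            total += abs(S[i][j]) ** 2 / (levels[i] - levels[j])
    return total

levels = [1.0, 3.0]                  # hypothetical unperturbed levels H', H''
S = [[0.02, 0.1], [0.1, -0.03]]      # hypothetical real symmetric perturbation

# Exact characteristic values of the 2x2 matrix K = H + S, for comparison.
k11 = levels[0] + S[0][0]
k22 = levels[1] + S[1][1]
mean = 0.5 * (k11 + k22)
half_gap = math.hypot(0.5 * (k11 - k22), S[0][1])
exact = [mean - half_gap, mean + half_gap]

# First order adds the diagonal element; second order adds the sum above.
first = [levels[0] + S[0][0], levels[1] + S[1][1]]
second = [first[0] + second_order_shift(levels, S, 0),
          first[1] + second_order_shift(levels, S, 1)]

print(exact, first, second)
```

With these numbers the off-diagonal element is one-twentieth of the level spacing, and the second-order values lie much closer to the exact ones than the first-order values do; the residual error is of the third order.
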

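With $K'''$ different from $K'$, equation (149) reduces to

$$\sum_{H''} S^0_{H'H''}\, \Delta_1 a_{H''K'''}
  = \Delta_1 H'''\, \Delta_1 a_{H'K'''} + (H''' - H')\, \Delta_2 a_{H'K'''},$$

giving, with the help of (148a) and (148b), the following expression for
the second-order correction in the coefficients $a$:

$$\Delta_2 a_{H'K'''} = \frac{1}{H''' - H'} \Bigl[ \sum_{H''} S^0_{H'H''}\, \Delta_1 a_{H''K'''}
  - S^0_{H'''H'''}\, \Delta_1 a_{H'K'''} \Bigr]$$

or

$$\Delta_2 a_{H'K'''} = \sum_{H'' \neq H', H'''} \frac{S^0_{H'H''}\, S^0_{H''H'''}}{(H''' - H')(H''' - H'')}
  + \frac{(S^0_{H'H'} - S^0_{H'''H'''})\, S^0_{H'H'''}}{(H''' - H')^2}. \tag{149b}$$

In carrying out the summation over $H''$ we must drop the term $H'' = H'''$
(as well as $H'' = H'$, which is written out separately) because the formula

$$\Delta_1 a_{H''K'''} = \frac{S^0_{H''H'''}}{H''' - H''}$$

holds for the case $H'' \neq H'''$ only, while for $H'' = H'''$ we have

$$\Delta_1 a_{H'''K'''} = 0. \tag{150}$$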

This equation can be obtained from the normalization condition which
must be satisfied by the matrix $a$, namely,

$$\sum_{H'} a^*_{H'K''}\, a_{H'K''} = 1.$$

Putting $a_{H'K''} = \delta_{H'K''} + \Delta a_{H'K''}$, then since $\delta_{H'K''} = 1$
when $H' = H''$ and 0 when $H' \neq H''$, we get

$$\Delta a_{H''K''} + \Delta a^*_{H''K''} + \sum_{H'} \Delta a^*_{H'K''}\, \Delta a_{H'K''} = 0, \tag{150a}$$

whence it follows that, to the first order,

$$\Delta_1 a_{H''K''} + \Delta_1 a^*_{H''K''} = 0.$$

Since the diagonal elements of the matrix $a$ must be real
($a_{H''K''} = a^*_{H''K''}$), we have

$$\Delta_1 a_{H''K''} = 0.$$

The formula (149) likewise leaves undetermined the diagonal elements
of $\Delta_2 a_{H'K'''}$. They can be determined, however, with the help of the
equation (150a) or rather the equation

$$2 \Delta_2 a_{H''K''} + \sum_{H'} |\Delta_1 a_{H'K''}|^2 = 0,$$

which is obtained from it as a second approximation (dropping all
terms save those of the second order) and which, in conjunction with
(148b), gives

$$\Delta_2 a_{H''K''} = -\frac{1}{2} \sum_{H' \neq H''} \frac{|S^0_{H'H''}|^2}{(H'' - H')^2}. \tag{150b}$$
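The role of the diagonal second-order correction can be seen numerically. The following sketch is a hedged illustration with hypothetical numbers: it builds one column of the transformation matrix for a three-level system from the first-order coefficients of (148b), then adds the diagonal correction of (150b), and compares the sum of squared moduli with 1 in both cases. The off-diagonal second-order corrections are deliberately left out, since they affect the normalization only in the third order.

```python
def first_order_column(levels, S, k):
    """First-order coefficients of column k: S[j][k] / (H_k - H_j), zero for j == k."""
    column = []
    for j in range(len(levels)):
        if j == k:
            column.append(0.0)
        else:
            column.append(S[j][k] / (levels[k] - levels[j]))
    return column

def second_order_diagonal(levels, S, k):
    """Diagonal second-order correction: -1/2 * sum |S[j][k]|^2 / (H_k - H_j)^2."""
    total = 0.0
    for j in range(len(levels)):
        if j != k:
            total += abs(S[j][k]) ** 2 / (levels[k] - levels[j]) ** 2
    return -0.5 * total

levels = [1.0, 2.0, 4.0]                  # hypothetical unperturbed levels
S = [[0.0, 0.1, 0.05],
     [0.1, 0.0, 0.08],
     [0.05, 0.08, 0.0]]                   # hypothetical real symmetric perturbation
k = 1                                     # column belonging to the middle level

col_first = first_order_column(levels, S, k)
col_first[k] += 1.0                       # zero approximation: unit matrix
col_second = list(col_first)
col_second[k] += second_order_diagonal(levels, S, k)

norm_first = sum(abs(c) ** 2 for c in col_first)
norm_second = sum(abs(c) ** 2 for c in col_second)
print(norm_first, norm_second)
```

Without the correction the squared norm exceeds 1 by a second-order amount; with it the excess drops to the fourth order.
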


The formula (150) follows in a quite obvious manner from the geometrical
interpretation of the coefficients $a_{H'K''}$ as the cosines of the
angles between the symmetry axes of the quadric surfaces representing
(in the state-space) the energy $H$ and the energy $K$. Since, by definition,
$H$ and $K$ must differ very little from each other, the corresponding
axes $H'$ and $K'$ (or $H''$ and $K''$, etc.) must have approximately the same
direction, while the non-corresponding axes ($H'$ and $K''$) must be nearly
perpendicular to each other. Denoting the angle between $H'$ and $K'$
by $\alpha_{H'K'}$ and considering it as a small quantity of the first order, we get

$$a_{H'K'} = \cos \alpha_{H'K'} \approx 1 - \tfrac{1}{2} (\alpha_{H'K'})^2,$$

which means that the first-order correction $\Delta_1 a_{H'K'}$ vanishes, while
$\Delta_2 a_{H'K'} = -\tfrac{1}{2} (\alpha_{H'K'})^2$. Comparing this with (150b), we can put

$$(\alpha_{H''K''})^2 = \sum_{H' \neq H''} \frac{|S^0_{H'H''}|^2}{(H'' - H')^2}. \tag{151}$$

This formula shows that the angles between the corresponding symmetry
axes of $H$ and $K$ are of the same order of magnitude as the
ratios of the matrix elements of the perturbation energy $S$ with respect
to different $H$-states to the difference between the characteristic values
of $H$ for these states.

The same result, in a still simpler form, is obtained from a consideration
of the first approximation values of the coefficients $a_{H'K''} = \Delta_1 a_{H'K''}$
($K'' \neq K'$). Putting $a_{H'K''} = \cos \alpha_{H'K''}$ and
$\alpha_{H'K''} = \tfrac{1}{2}\pi + \Delta\alpha_{H'K''}$, where $\Delta\alpha_{H'K''}$
denotes a small angle, we get

$$a_{H'K''} = -\sin \Delta\alpha_{H'K''} \approx -\Delta_1 \alpha_{H'K''},$$

whence, according to (148b),

$$\Delta_1 \alpha_{H'K''} = \frac{S^0_{H'H''}}{H' - H''}. \tag{151a}$$

This angle should not be confused with the angle through which the
axis $H''$ has to be rotated in order to coincide with $K''$ and which is
equal to $\alpha_{H''K''}$. The comparison of equations (151) and
(151a) shows that the latter angle can be regarded as the (geometrical)
sum of mutually perpendicular angular displacements of the type
$\Delta\alpha_{H'K''}$ for different values of $H'$ ($\neq H''$). In other words, the
angular displacement $\Delta\alpha_{H'K''}$ can be considered as the component along
the $H'$-axis of the elementary rotation $\alpha_{H''K''}$. We thus obtain the law of
the vector composition of elementary rotations about different (mutually
perpendicular) axes, which is a generalization of the corresponding law
for ordinary three-dimensional space.

In the latter case, an infinitesimal rotation of the coordinate system
can be specified by a certain vector $\omega$, which determines the (apparent)
change of a fixed vector $r$ by means of the formula $\Delta r = -\omega \times r$. So
far as the first approximation is concerned, the components of $\omega$ and $\Delta r$
along the old and new axes can be identified with each other. Written in
components along the old axes, the preceding formula gives the following
equations:

$$\Delta x_1 = \xi_1 - x_1 = -\omega_2 x_3 + \omega_3 x_2,$$
$$\Delta x_2 = \xi_2 - x_2 = -\omega_3 x_1 + \omega_1 x_3,$$
$$\Delta x_3 = \xi_3 - x_3 = -\omega_1 x_2 + \omega_2 x_1,$$

which can be considered as a particular or rather as a limiting case of
an orthogonal transformation for the case when the two systems ($\Sigma$ and
$\Xi$) differ very little from each other. Putting

$$\omega_1 = \alpha_{23} = -\alpha_{32}, \quad \omega_2 = \alpha_{31} = -\alpha_{13}, \quad \omega_3 = \alpha_{12} = -\alpha_{21},$$

we can rewrite the preceding equations in the form

$$\Delta x_{n'} = -\sum_{n''} \alpha_{n''n'}\, x_{n''}. \tag{152}$$

Comparing equations (152) with the exact transformation equations

$$\xi_{\nu'} = \sum_{n''} a_{n''\nu'}\, x_{n''},$$

we see that they can be obtained from the latter if we put

$$a_{n''\nu'} = \delta_{n''n'} - \alpha_{n''n'}, \qquad \xi_{\nu'} = x_{n'} + \Delta x_{n'},$$

where $\nu'$ and $n'$ denote corresponding axes of the new and old system,
i.e. such axes as were initially coincident. The angles $\alpha_{n'\nu'} = \alpha_{n'n'}$
must approximately vanish for the normalizing and orthogonality
relations to be satisfied.
We thus see that an infinitesimal orthogonal transformation in
ordinary space can be treated as an infinitesimal rotation of the original
coordinate systems, specified both with regard to the direction of the
rotation axis and the angle of rotation about it by the (infinitely small)
vector $\omega$ with the components $\omega_1, \omega_2, \omega_3$, or by the
‘antisymmetrical tensor’ $\alpha$ with the components
$\alpha_{n'n''} = -\alpha_{n''n'}$, referred to the original axes.
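The equivalence between the vector $\omega$ and the antisymmetrical tensor $\alpha$ can be verified directly. The sketch below is only an illustration with hypothetical numbers; it assumes the identification $\omega_1 = \alpha_{23}$, $\omega_2 = \alpha_{31}$, $\omega_3 = \alpha_{12}$ and the index convention $\Delta x_{n'} = -\sum_{n''} \alpha_{n''n'} x_{n''}$, and checks that this sum reproduces $-\omega \times r$ component by component.

```python
def cross(u, v):
    """Ordinary vector product of two 3-component lists."""
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

def alpha_from_omega(w):
    """Antisymmetric tensor with alpha_23 = w1, alpha_31 = w2, alpha_12 = w3."""
    w1, w2, w3 = w
    return [[0.0, w3, -w2],
            [-w3, 0.0, w1],
            [w2, -w1, 0.0]]

def delta_from_tensor(alpha, x):
    """Delta x_n = -(sum over m of alpha[m][n] * x[m])."""
    result = []
    for n in range(3):
        total = 0.0
        for m in range(3):
            total += alpha[m][n] * x[m]
        result.append(-total)
    return result

omega = [1e-3, -2e-3, 5e-4]    # hypothetical small rotation vector
r = [1.0, 2.0, 3.0]            # hypothetical fixed vector

by_vector = [-c for c in cross(omega, r)]
by_tensor = delta_from_tensor(alpha_from_omega(omega), r)
print(by_vector, by_tensor)
```
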

These results can easily be extended to the infinitesimal orthogonal
transformations in the ‘state-space’, corresponding to a transition from
the symmetry axes of the quadric surface representing the unperturbed
energy $H$, to the symmetry axes of the quadric representing the
perturbed energy $K = H + S$.

Leaving the perturbation energy $S$ unspecified, we can represent
the (apparent) change of the components of any vector $\psi$ due to the
small rotation of the coordinate axes by an equation wholly similar to
(152), namely,

$$\Delta\psi_{H'} = -\sum_{H''} \alpha_{H''H'}\, \psi_{H''}, \tag{152a}$$

where $\alpha$ denotes an ‘anti-Hermitian’ tensor (which is a generalization
of the antisymmetric one) satisfying the condition

$$\alpha_{H'H''} = -\alpha^*_{H''H'}, \tag{152b}$$

or $\alpha^\dagger = -\alpha$.

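These results can be obtained in the same way as in the three-dimensional
case from the exact transformation equation,

$$\psi_{K'} = \sum_{H''} \psi_{H''}\, a_{H''K'},$$

by putting $\psi_{K'} = \psi_{H'} + \Delta\psi_{H'}$ and
$a_{H''K'} = \delta_{H''K'} - \alpha_{H''H'}$, where the $\alpha$
denote small quantities of the first order. Substituting the latter
expressions in the orthogonality and normalizing conditions,

$$\sum_{K'} a_{H''K'}\, a^*_{H'''K'} = \delta_{H''H'''},$$

and neglecting second-order terms, we get, if the summation index $K'$
is replaced by $H'$,

$$\delta_{H''H'''} - \alpha_{H''H'''} - \alpha^*_{H'''H''} = \delta_{H''H'''},$$

that is,

$$\alpha_{H''H'''} + \alpha^*_{H'''H''} = 0,$$

which is equivalent to (152b).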

As a matter of fact, from (148b) and because $\alpha_{H''H'} = -\Delta_1 a_{H''K'}$,
we have

$$\alpha_{H''H'} = \frac{S^0_{H''H'}}{H'' - H'}, \tag{152c}$$

so that the condition (152b) is actually satisfied.
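That (152c) satisfies the anti-Hermitian condition (152b) whenever the perturbation matrix is Hermitian can be confirmed on a small example. The sketch below is hedged: the three levels and the complex Hermitian matrix are hypothetical, and the diagonal of $\alpha$ is simply set to zero, in accordance with (150).

```python
def rotation_tensor(levels, S):
    """alpha[j][i] = S[j][i] / (H_j - H_i) for j != i, and zero on the diagonal."""
    n = len(levels)
    alpha = []
    for j in range(n):
        row = []
        for i in range(n):
            if i == j:
                row.append(0j)
            else:
                row.append(S[j][i] / (levels[j] - levels[i]))
        alpha.append(row)
    return alpha

levels = [1.0, 2.5, 4.0]                        # hypothetical unperturbed levels
S = [[0.01, 0.02 + 0.01j, -0.03j],
     [0.02 - 0.01j, -0.02, 0.015],
     [0.03j, 0.015, 0.0]]                       # hypothetical Hermitian perturbation

alpha = rotation_tensor(levels, S)

# Largest deviation from alpha[i][j] = -conj(alpha[j][i]).
max_violation = 0.0
for i in range(3):
    for j in range(3):
        deviation = abs(alpha[i][j] + alpha[j][i].conjugate())
        if deviation > max_violation:
            max_violation = deviation
print(max_violation)
```
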
It should be mentioned that in the case of a generalized space with
more than three dimensions an antisymmetrical tensor is no longer
equivalent to a vector.† It is therefore impossible to represent the
rotation of the quadric surface $H$ into such a position that its axes
coincide (in direction but not in length!) with those of the quadric $K$
by means of a vector corresponding to $\omega$, or to specify the rotation
by its components along the different axes of $H$. Instead of using
the coordinate axes, we can, however, use for the same purpose the
coordinate planes (in the case of ordinary space the number of these
planes is equal to the number of axes, which explains the possibility
of representing the former by the latter). The quantities $\alpha_{H''H'}$ can be
interpreted as the projections of the rotation $H \to K$ on the planes
($H''$, $H'$). The angle through which $H''$ must be rotated to coincide with
$K''$ is given by the equation

$$\alpha^2_{H''K''} = \sum_{H'} |\alpha_{H''H'}|^2,$$

which is similar to the ordinary equation for the composition of
elementary rotations considered as vectors (for instance,
$\omega^2 = \omega_1^2 + \omega_2^2 + \omega_3^2$)
because in the preceding equation one of the axes ($H''$) remains fixed
and the summation over the different planes passing through it is
equivalent to a summation over all the axes different from $H''$.

† If $n$ is the number of dimensions, then the number of different non-vanishing
components of an antisymmetric tensor is equal to $\tfrac{1}{2} n(n-1)$, which is
equal to the number ($n$) of components of a vector only when $n = 3$.

The expressions (152c) for the elementary rotations, as well as
the corresponding (first-order) corrections for the energy values
$\Delta H' = K' - H'$, can be obtained in a somewhat simpler way than before
by starting from the expressions (152a) and using the equations
$H\psi_{H'} = H'\psi_{H'}$ and $K\psi_{K'} = K'\psi_{K'}$.

Putting in the latter equation $\psi_{K'} = \psi_{H'} + \Delta\psi_{H'}$,
$K' = H' + \Delta H'$, and $K = H + S$, we have

$$H\psi_{H'} + S\psi_{H'} + H\Delta\psi_{H'} + S\Delta\psi_{H'}
  = H'\psi_{H'} + \Delta H'\psi_{H'} + H'\Delta\psi_{H'} + \Delta H'\Delta\psi_{H'},$$

or dropping terms of the second order of smallness (i.e. the products
$S\Delta\psi_{H'}$ and $\Delta H'\Delta\psi_{H'}$):

$$S\psi_{H'} + H\Delta\psi_{H'} = \Delta H'\psi_{H'} + H'\Delta\psi_{H'}. \tag{153}$$

Now by the definition of matrix elements we have

$$S\psi_{H'} = \sum_{H''} S^0_{H''H'}\, \psi_{H''}.$$

On the other hand we get, according to (152a),

$$H\Delta\psi_{H'} = -\sum_{H''} \alpha_{H''H'}\, H\psi_{H''} = -\sum_{H''} \alpha_{H''H'}\, H''\psi_{H''}.$$
Thus (153) can be written in the form

$$\sum_{H''} (S^0_{H''H'} - H''\alpha_{H''H'})\, \psi_{H''}
  = \Delta H'\psi_{H'} - H' \sum_{H''} \alpha_{H''H'}\, \psi_{H''},$$

or

$$\sum_{H''} \bigl[ S^0_{H''H'} - (H'' - H')\, \alpha_{H''H'} \bigr] \psi_{H''}
  = \Delta H'\psi_{H'} = \sum_{H''} \Delta H'\, \delta_{H''H'}\, \psi_{H''}. \tag{153a}$$

Equating the coefficients of $\psi_{H''}$ on both sides, we get

$$S^0_{H''H'} - (H'' - H')\, \alpha_{H''H'} = \Delta H'\, \delta_{H''H'}, \tag{153b}$$

in agreement with the results previously found.

The fact that equation (153a) splits up into equations (153b) for
the coefficients of the separate $\psi_{H''}$ is due, as already pointed out
(Part I, § 18), to the mutual orthogonality of the functions $\psi_{H''}$ (as
functions of the coordinates $x$, $y$, $z$). If we have an equation of the type

$$\sum_{H'} a_{H'}\psi_{H'} = \sum_{H'} b_{H'}\psi_{H'},$$

which holds identically (i.e. for all values of $x$, $y$, $z$), then multiplying
it by $\psi^*_{H''}$ and integrating over $x$, $y$, $z$, we get
$a_{H''} = b_{H''}$, all the other terms vanishing.

We have assumed, hitherto, that the unperturbed problem was
‘non-degenerate’, i.e. that all the characteristic values of $H$ were different.
The essential character of this assumption is clearly seen from the fact
that the equations

$$\alpha_{H''H'} = \frac{S^0_{H''H'}}{H'' - H'}$$

become meaningless (unless $S^0_{H''H'}$ vanishes) when $H'' = H'$, while the
two states $\psi_{H''}$ and $\psi_{H'}$ remain different. It is, moreover,
impossible to specify the different states, as has been done so far, by the
value of the energy alone. We shall therefore add to it some other quantity
$C$, which commutes with it (i.e. represents a constant of the motion) and
which can be supposed to have different values for different states which
have the same energy.

The alterations in the treatment of a perturbation problem which
are necessitated by the presence of degeneracy in the unperturbed
problem can best be understood with the help of the geometrical
interpretation. If the energy $H$ is represented as a quadric surface in the
state-space, with symmetry axes whose lengths are inversely proportional
to the corresponding characteristic values of $H$, then degeneracy means
that a few of these axes have the same length, the corresponding section
of the surface, comprising all the equal axes, being ‘circle-like’. A
degeneracy of this sort is met with in ordinary analytical geometry in the
case of an ellipsoid with two or three equal axes, the ellipsoid
degenerating into a spheroid or into a sphere.
So long as the surface is not degenerate, the directions of its symmetry
axes are perfectly definite. Degeneracy involves an arbitrariness in the
choice of the symmetry axes within the ‘circle-like’ section, any
orthogonal system of axes being appropriate. It may be mentioned that this
corresponds to the physical indeterminateness of the corresponding
states and to the necessity of specifying them with the help of some
other quantity, $C$ say, which can also be imagined to be represented
by a certain quadric surface. The commutability of $H$ and $C$ means,
as we know, that the symmetry axes of the corresponding surfaces have
the same directions; if one of them has a ‘circular’ section its axes
within this section can be identified with those of the other.

Let us assume that the surface representing the energy $K$ of the
perturbed motion is non-degenerate. We shall then find two types of
relations between its symmetry axes and those of $H$. So long as the
latter are intrinsically determined (i.e. apart from the circular sections),
the axes $K'$ must differ but very little from the corresponding axes
$H'$, as has been supposed hitherto. So far, however, as a set of equal
$H$-axes is concerned, a set contained within a circular section and fixed
more or less arbitrarily, the angles between them and the set of $K$-axes
corresponding to this section need not be small. The process of successive
approximations, which was based on the assumption that all the angles
$\alpha_{H'K''}$ were small, must therefore, in general, lead to wrong results.
That it does lead to wrong results is clear from the formula (152c)
which gives an infinitely large value for $\alpha_{H''H'}$ if the difference
$H'' - H'$ (for two different states) vanishes, unless $S^0_{H''H'}$ also vanishes.

It is thus clear that before starting on the process of successive
approximations based upon the assumption of the smallness of the
angles, one must make them actually small by transforming the sets
of axes which refer to ‘circular’ sections in such a way that they
approximately coincide with the corresponding set of $K$-axes. This
‘preliminary’ or zero-order transformation can be carried out for each
circular section independently, i.e. by dropping from the general
equation of the $K$-quadric, or rather from the equations of the
$K_H \to K_K$ transformation, all the terms which connect different circular
sections with each other (or with individual axes, if any). In fact the
transformation coefficients $a_{H'K'''}$ and $a_{H''K'''}$, where $H'$ and $H''$
refer to one circular section and $K'''$ to another ‘nearly’ circular section,
must be very small of the first order (the two sections being ‘nearly’
perpendicular to each other) and can therefore be neglected compared with the
coefficients $a_{H'K''}$ or $a_{H''K''}$, where $K''$ refers to the nearly circular
section of $K$ which approximately coincides with the circular section of $H$
containing the axes $H'$ and $H''$.

It will be convenient to alter our previous notation and to denote
the $r'$ axes of a circular section corresponding to the value $H = H'$ by
$C'_1, C'_2, \ldots, C'_{r'}$. The $r'$ axes of the corresponding nearly circular
section of $K$ will be denoted accordingly by $K'_1, K'_2, \ldots, K'_{r'}$. There is,
in general, no one-to-one correspondence between these $r'$ $K'$-axes and the
$r'$ $C'$-axes. They form two different orthogonal systems and the
preliminary transformation which we are looking for is precisely the
transformation $C' \to K'$ carried out for each circular section separately.

The exact equations of the transformation $H \to K$ are thus split up
into a set of ‘zero-order’ equations of the following form:

$$\sum_{n=1}^{r'} K_{C'_m C'_n}\, a_{C'_n K'} = K'\, a_{C'_m K'}, \tag{154}$$

where $m = 1, 2, 3, \ldots, r'$.

For each of the ‘multiple’ values of $H$ corresponding to $r'$ different
states, we thus get a system of $r'$ linear homogeneous equations involving
states of this set only. These equations are quite similar to the general
transformation equations for the case of no degeneracy,

$$\sum_{H''} K_{H'H''}\, a_{H''K'''} = K'''\, a_{H'K'''},$$

differing from them solely by the fact that they refer to a finite number
of states, a fact which makes it possible to solve them exactly without
the use of the method of successive approximation (whose application
has to be postponed).

Putting $K = H + S$ and $K' = H' + \Delta H'$ in (154), then since
$H_{C'_m C'_n} = H' \delta_{mn}$ we get

$$\sum_{n=1}^{r'} S^0_{C'_m C'_n}\, a_{C'_n K'} = \Delta H'\, a_{C'_m K'}. \tag{154a}$$

For the sake of simplicity, we shall rewrite this equation, or rather
the set of $r'$ equations, in the form

$$\sum_{n=1}^{r'} S^0_{mn}\, a_n = \Delta H'\, a_m, \tag{154b}$$

where $m$ is an abbreviation for $C'_m$ and the index $K'$ is dropped. Their
compatibility condition

$$\begin{vmatrix}
S^0_{11} - \Delta H' & S^0_{12} & \cdots & S^0_{1r'} \\
S^0_{21} & S^0_{22} - \Delta H' & \cdots & S^0_{2r'} \\
\vdots & \vdots & \ddots & \vdots \\
S^0_{r'1} & S^0_{r'2} & \cdots & S^0_{r'r'} - \Delta H'
\end{vmatrix} = 0 \tag{154c}$$

gives $r'$ values for the ‘additional’ energy $\Delta H'$, which are, in general,
different from each other. This is expressed by saying that the
perturbation splits up each multiple energy-level $H'$ into a number ($r'$) of
different sub-levels $K' = H' + \Delta H'_1,\ H' + \Delta H'_2,\ \ldots,\ H' + \Delta H'_{r'}$.
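For a two-fold level the compatibility condition (154c) is a quadratic equation and can be solved by hand. The following sketch is a hedged illustration only: it assumes a real symmetric 2x2 block of the perturbation matrix with hypothetical entries, writes the determinant as $\Delta H'^2 - (\text{trace})\,\Delta H' + (\text{determinant}) = 0$, and returns the two sub-levels.

```python
import math

def secular_roots(S):
    """Roots dH of det(S - dH * I) = 0 for a real symmetric 2x2 block S."""
    trace = S[0][0] + S[1][1]
    det = S[0][0] * S[1][1] - S[0][1] * S[1][0]
    # The discriminant equals (S11 - S22)^2 + 4 * S12^2, hence is non-negative;
    # max() only guards against rounding.
    disc = math.sqrt(max(0.0, trace * trace - 4.0 * det))
    return [(trace - disc) / 2.0, (trace + disc) / 2.0]

def split_level(h, S):
    """Sub-levels K' = H' + dH for a two-fold unperturbed level H' = h."""
    return [h + d for d in secular_roots(S)]

# Hypothetical two-fold level at H' = 2 coupled only through S12 = 0.05.
sublevels = split_level(2.0, [[0.0, 0.05], [0.05, 0.0]])
print(sublevels)  # close to [1.95, 2.05]
```

In this example the diagonal elements vanish, so the level splits symmetrically by an amount equal to the off-diagonal element, however small that element may be.
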

To each value of $\Delta H'$, $\Delta H'_s$ say, there corresponds a set of values
of the $r'$ coefficients $a_n$:

$$a_{ns} = a_{1s}, a_{2s}, \ldots, a_{r's}.$$

As in the general case, each of these sets must be normalized to 1, the
different sets being orthogonal to each other. We thus get for each
$r'$-fold value of the unperturbed energy $H'$ a unitary transformation
matrix $a$ of order $r'$, which serves to transform the original $r'$ functions
$\psi_{C'_1}, \ldots, \psi_{C'_{r'}}$ associated with the energy-level $H'$ into new
functions $\psi'_{K'_1}, \ldots, \psi'_{K'_{r'}}$ associated with the different
sub-levels into which these levels are split up. Using the one-row matrix
notation for the two sets of functions, we can write the relation between
them in the form $\psi' = a\psi$ or $\psi'^\dagger = \psi^\dagger a^\dagger$.

The preceding results are identical with those obtained in Chap. II, § 9,
by means of the variational method.

It should be understood that the functions $\psi'$ do not represent a set
of $K$-states, but another degenerate set of $H$-states which only
approximate to the corresponding $K$-states. Starting with these functions, it
is possible, in the usual way, to obtain higher approximations. It is
important to note that the first approximation values for the energy
are determined, according to (154c), in conjunction with the ‘zero
approximation’ for the characteristic functions.

It can easily be shown that the $H$-states specified by the new
functions $\psi'$ are such that the matrix of the perturbation energy $S$ with
respect to them is diagonal. This follows from the fact that equations
(154a) are of the same form as the equations for the transformation of
the matrix $K_H$ to the diagonal form $K_K$, $K$ being replaced by $S$, $K'$ by
$\Delta H'$, and the whole quadric $K$ by its ‘nearly circular’ section. Denoting
the transformed matrix of the perturbation energy (for the $r'$ states $\psi'$)
by $S'$, we have

$$S' = a^{-1} S a = a^\dagger S a.$$

The diagonal elements of $S'$ are equal to the values of $\Delta H'$ for the
corresponding states, so that we can put

$$S'_{ss} = \Delta H'_s,$$

which is exactly of the same form as equation (148a), referring to the
case in which there is no degeneracy.
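The diagonal character of $S' = a^\dagger S a$ can be illustrated for a two-fold level, where the preliminary transformation is a plane rotation. The sketch below rests on assumptions of our own: a real symmetric 2x2 block with hypothetical entries, and the rotation angle $\theta$ fixed by $\tan 2\theta = 2 S_{12} / (S_{11} - S_{22})$, which is one standard way of diagonalizing such a block.

```python
import math

def diagonalizing_rotation(S):
    """Real rotation matrix whose columns are the new axes of the 2x2 block S."""
    theta = 0.5 * math.atan2(2.0 * S[0][1], S[0][0] - S[1][1])
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def transform(a, S):
    """Return a^T S a; for a real matrix a the adjoint is the transpose."""
    n = len(S)
    Sa = [[sum(S[i][k] * a[k][j] for k in range(n)) for j in range(n)]
          for i in range(n)]
    return [[sum(a[k][i] * Sa[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

S_block = [[0.03, 0.04], [0.04, -0.03]]   # hypothetical perturbation block
a = diagonalizing_rotation(S_block)
S_new = transform(a, S_block)
print(S_new)  # off-diagonal entries vanish up to rounding
```

The two diagonal entries of `S_new` are the two values of $\Delta H'$ belonging to this level.
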
These equations have a very simple physical meaning, which can
be expressed by saying that the additional energy due to perturbing
forces is equal, in the first approximation, to the average value of the
perturbation energy $S$ for the unperturbed motion.† When there is no
degeneracy, the latter is specified unambiguously by a function $\psi_{H'}$
referring to one definite state. In the presence of degeneracy these
unperturbed states have to be defined by means of the preliminary
transformation, and are, in general, different from the original states.

We are now in a position to formulate the conditions under which
a perturbation can be treated as weak. This weakness must obviously
correspond to the smallness of the angles between the symmetry axes
of the surfaces $K$ and $H$ and also to a smallness of the difference
between the lengths of these axes. The ‘circular’ sections of $H$
corresponding to degeneracy need not be taken into account, since the
directions of the axes lying within them remain arbitrary and can
always be adjusted to be close to those of the corresponding section
of $K$.

Now we have seen that, to a first approximation, the angles $\alpha_{H'K''}$ are
equal to $S^0_{H'H''} / (H' - H'')$ and the differences
$K_{K'K'} - H_{H'H'} = K' - H'$ are equal to $S^0_{H'H'}$. It follows from this
that the perturbation can be considered as weak if the matrix elements of
the perturbation energy $S$ with respect to different values of $H$ are small
compared with the difference between these values, and the diagonal
elements are small compared with the corresponding values of $H$.
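The criterion just stated lends itself to a simple numerical test. The sketch below is a hedged illustration: the function name, the tolerance `tol` standing for ‘small’, and the sample matrices are all assumptions of ours. Pairs of states with equal unperturbed energy are skipped, on the understanding that the preliminary transformation has already been carried out for them.

```python
def is_weak(levels, S, tol=0.1):
    """True if every |S_ij| <= tol * |H_i - H_j| (for H_i != H_j)
    and every |S_ii| <= tol * |H_i|; tol is a hypothetical threshold."""
    n = len(levels)
    for i in range(n):
        if abs(S[i][i]) > tol * abs(levels[i]):
            return False
        for j in range(n):
            if levels[i] != levels[j]:
                if abs(S[i][j]) > tol * abs(levels[i] - levels[j]):
                    return False
    return True

# Widely separated levels with small coupling: weak in the above sense.
weak_case = is_weak([1.0, 3.0], [[0.02, 0.01], [0.01, -0.03]])

# Close levels with a coupling exceeding their separation: not weak.
strong_case = is_weak([1.0, 1.05], [[0.0, 0.1], [0.1, 0.0]])
print(weak_case, strong_case)
```
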

The smallness of $S$ in this sense does not exclude the possibility that
$S$, considered as a function of the coordinates of the particle (i.e. in
the classical sense), should become very large and even infinite at certain
points or regions. This makes the range of applicability of the
wave-mechanical perturbation theory infinitely broader than that of the
classical mechanics, which is restricted by the condition that $S$ should
be small compared with $H'$ at all points of the unperturbed path.

† It should be mentioned that the same result holds in the perturbation theory of
classical mechanics, the average value of $S$ being defined here as the average
value with respect to the time.

20. Extension of the Preceding Theory to the Case of ‘Relative
Degeneracy’ and Continuous Spectra; Effect of Perturbation
on Various Physical Quantities.

In many non-degenerate problems we meet with the case of a perturbation
which cannot be described as weak (in the above sense) with
regard to pairs of (unperturbed) states belonging to certain sets, while
it remains weak with regard to pairs of states belonging to different
sets. This means that the matrix elements of $S$ with respect to the
different states of the same set are large, or at least not small, compared
with the energy differences between these states, while the matrix
elements of $S$ with respect to states belonging to any two different sets
are small compared with the corresponding energy differences. In the
limiting case when the energy differences between the states of the same
set vanish, we get back to the ‘degenerate’ problem considered before.
It is plain, however, that the same method can be applied approximately
when these energy-differences do not exactly vanish but are
small compared with the corresponding matrix elements of $S$, so that
without sensible error the (unperturbed) energies of the states in
question can be identified with each other.

This serves to show that the notion of ‘degeneracy’ can be visualized
as a relative one, from the point of view of the perturbation energy
$S$ which we are interested in, the ‘absolute’ degeneracy which has been
considered hitherto forming but the limiting case of this relative
degeneracy. If, for instance, $S$ contains a continuously variable
parameter (an electric or magnetic field, say), we can pass, by steadily
increasing it, from a practically non-degenerate problem to a practically
degenerate one, the degeneracy extending over certain sets of states
whose energy-differences become small, as $S$ increases, with respect to
the corresponding matrix elements of $S$, while the matrix elements of
the same function remain small compared with the energy-differences
between states of different sets.†

We shall assume that such a subdivision of the various unperturbed
states into relatively narrow sets, which lie wide apart from each other
on the energy scale, is possible, and shall denote these states as
multiplets. When the perturbation energy (defined by the value of its matrix
elements with respect to the corresponding states) is small compared
with the distance between the different multiplets and not small
(without necessarily being large) compared with the ‘widths’ of the
separate multiplets, the perturbation theory given in the preceding
section is no longer applicable, and must be replaced by a more general
method.

† A typical example of this condition is found in the transition from a weak to a
strong magnetic field in the theory of the Zeeman effect (or Paschen-Back effect).

This generalized perturbation method (which has been pointed out
by Lennard-Jones and by Jones) is extremely simple and consists in
splitting up the exact system of the transformation equations

$$\sum_{H''} K_{H'H''}\, a_{H''K'''} = K'''\, a_{H'K'''}$$

into a number of approximate systems, referring to the separate
multiplets and obtained from the above equations by confining the summation
over $H''$ for each value of $H'$ to such states only as belong to the same
multiplet as $H'$.

This is exactly what we have done before in writing down the
equations (154) which refer to the limiting case of absolute degeneracy.
They are applicable, however, just as well to the more general case of
a relative degeneracy if the letters $C'_1, C'_2, \ldots, C'_{r'}$ are used to denote
the states of the same ‘multiplet’, with energy-values
$H'_1, H'_2, \ldots, H'_{r'}$ lying close to a certain value $H'$ and far away from
the energy values specifying all the other unperturbed states. To prove
this we need but note the fact that the matrix elements of the total energy
$K$ with respect to states of different sets are relatively small and can
therefore be neglected compared with those which refer to the same set
(multiplet).

In the geometrical representation of the unperturbed and the
perturbed states as the axes of the quadric surfaces $H$ and $K$ in the
state-space, a multiplet corresponds to a ‘nearly’ circular section of the
former. So long as each such section is nearly parallel to a certain also
nearly circular section of the $K$-surface, we have to deal with a
perturbation which can be considered as weak with regard to the different
multiplets. It can be, however, at the same time strong with regard to
the states of the same multiplet, if the symmetry axes of the
corresponding nearly circular sections of $H$ and $K$ have entirely different
directions. A one-to-one correspondence between the unperturbed
states of each multiplet and the perturbed ones cannot be traced in this
case, just as in the case of an absolute degeneracy. The difference
between the two cases lies only in the fact that in the former case the
unperturbed states are fixed unambiguously, while in the latter they
are represented by a perfectly arbitrary set of mutually perpendicular
axes in the corresponding exactly circular section of the quadric $H$.

As has just been mentioned, the equations (154) still hold for the
case of the 'relative degeneracy' if the indices $m$, $n$ serve to
distinguish the states of a multiplet belonging to neighbouring values
of the energy $H'_1, H'_2, \ldots, H'_{r'}$. The equations (154 a) or
(154 b) are, however, not applicable to the general case, for we must
take into account the differences between the various 'sub-levels'
$H'_n$ ($n = 1, 2, \ldots, r'$). To do this we need only replace
$\Delta H'$ in (154 a) by $\Delta H'_m = K' - H'_m$, which gives, in the
notation of (154 b),

$$\sum_{n=1}^{r'} S^0_{mn}\, a_n = (K' - H'_m)\, a_m \tag{155}$$

or

$$\sum_{n=1}^{r'} S^0_{mn}\, a_n = (\Delta H' - \Delta H'_m)\, a_m, \tag{155 a}$$

where now $\Delta H' = K' - \bar{H}'$ and $\Delta H'_m = H'_m - \bar{H}'$,
$\bar{H}'$ denoting some average of the $r'$ values
$H'_1, \ldots, H'_{r'}$. The compatibility condition of the equations
(155 a),

$$\begin{vmatrix}
S^0_{11} + \Delta H'_1 - \Delta H' & S^0_{12} & \cdots & S^0_{1r'} \\
S^0_{21} & S^0_{22} + \Delta H'_2 - \Delta H' & \cdots & S^0_{2r'} \\
\vdots & \vdots & & \vdots \\
S^0_{r'1} & S^0_{r'2} & \cdots & S^0_{r'r'} + \Delta H'_{r'} - \Delta H'
\end{vmatrix} = 0, \tag{155 b}$$

differs from (154 c) by the additional terms $\Delta H'_m$ in the
diagonal elements of the determinant, and leads as before to $r'$ (in
general different) values of the perturbed energy
$K' = \bar{H}' + \Delta H'$. If the non-diagonal terms of the
determinant are sufficiently small it reduces to the product of the
diagonal terms, leading to the expressions
$\Delta H' = S^0_{nn} + \Delta H'_n$, that is $K' - H'_n = S^0_{nn}$,
which have been obtained in the preceding section for the case of no
degeneracy. If, on the contrary, the terms $\Delta H'_m$, or rather the
differences $H'_m - H'_n$, are small compared with $S^0_{mn}$, equation
(155 b) practically reduces to the equation (154 c) for the case of
complete (absolute) degeneracy.
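The two limiting cases of (155 b) can be checked on the smallest possible multiplet, $r' = 2$, where the determinant is a quadratic in $\Delta H'$. The following is a minimal sketch and not part of the original text: the function name and all numerical values are illustrative assumptions, and the perturbation matrix is taken to be Hermitian.

```python
import math

def relative_degeneracy_levels(d1, d2, s11, s22, s12):
    """Roots dH of the 2x2 secular equation (155 b):

        | s11 + d1 - dH    s12            |
        | conj(s12)        s22 + d2 - dH  |  = 0,

    where d_m = H'_m - Hbar' are the sub-level offsets (hypothetical inputs).
    """
    a = s11 + d1
    b = s22 + d2
    mean = 0.5 * (a + b)
    half_gap = 0.5 * (a - b)
    root = math.sqrt(half_gap ** 2 + abs(s12) ** 2)
    return mean - root, mean + root

# Weak coupling: off-diagonal element small compared with the sub-level
# spacing, so each root stays close to S_nn + dH_n (non-degenerate limit).
weak = relative_degeneracy_levels(-1.0, 1.0, 0.1, 0.2, 0.01)

# Strong coupling: sub-level spacing small compared with the off-diagonal
# element, so the roots approach those of the fully degenerate case.
strong = relative_degeneracy_levels(-1e-3, 1e-3, 0.0, 0.0, 0.5)

print(weak, strong)
```

In the first call the roots differ from the diagonal values only in second order of the coupling; in the second they are fixed almost entirely by the off-diagonal element, as in the case of absolute degeneracy.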

We have hitherto assumed that the wave functions $\psi_{H'}$ specifying
the unperturbed states are orthogonal with respect to each other. The
above theory can easily be extended to the case when the orthogonality
condition is not fulfilled. We need not, however, consider this case in
detail here, for it has been dealt with already in § 9 of Chap. II by
the variational method. The results embodied in the equations (61) are
a generalization of the equations (154), which differ from the
(specialized) equations (62) in the notation only.

It should be mentioned that to the states defined by non-orthogonal
wave functions there corresponds in the state-space a system of
non-orthogonal axes to which the energy quadrics H and K are referred.
The non-orthogonality of these axes means physically that the
corresponding states are not mutually excluded, the integral
$\int \psi^*_{H'} \psi_{H''}\, dV$ measuring in fact the probability of
one of them when the other is supposed to be realized.

So far we have dealt only with the case in which the unperturbed
motion has a discrete energy spectrum (which corresponds, classically,
to its being confined to a limited region of space). The case of a
continuous H-spectrum could be treated on similar lines. It is,
however, meaningless to determine the change $\Delta H$ of the
energy-levels produced by the perturbation, when these levels form a
continuous series. Thus one of the main problems of the perturbation
theory relating to the case of discrete H-spectra, together with the
complications arising in connexion with degeneracy, drops out. The
other problem, that of the determination of the change $\Delta\psi$ of
the wave functions specifying the stationary states, can be solved in
the same way as before, i.e. by determining the transformation
coefficients $a_{H'K''}$. In the present case the zero approximation is
given by the formula

$$a^0_{H'K''} = \delta(H' - H''),$$

instead of $a^0_{H'K''} = \delta_{H'H''}$. Instead of equation (144 b),
we have

$$\int K^0_{H'H''}\, a_{H''K'''}\, dH'' = K'''\, a_{H'K'''}.$$

Putting $a_{H'K'''} = \delta(H' - H''') + \Delta a_{H'K'''}$ and
$K = H + S$, then since $H^0_{H'H''} = H'\, \delta(H' - H'')$, we get

$$H'\,[\delta(H' - H''') + \Delta a_{H'K'''}] + S^0_{H'H'''}
+ \int S^0_{H'H''}\, \Delta a_{H''K'''}\, dH''
= K'''\,[\delta(H' - H''') + \Delta a_{H'K'''}],$$

which, with $K''' = H''' + \Delta H'''$, can be written in the form

$$S^0_{H'H'''} + \int S^0_{H'H''}\, \Delta a_{H''K'''}\, dH''
= \Delta H'''\, \delta(H' - H''') + (H''' - H' + \Delta H''')\, \Delta a_{H'K'''}.$$

This method can be conveniently applied only when the quantities
$\Delta a_{H'K'''}$ are known to be small, a condition which is, in
general, not satisfied.

An alternative method consists in the direct determination of the
change $\Delta\psi_{H'}$ of the functions $\psi_{H'}$ which is produced
by the perturbation, without the use of the integral representation

$$\Delta\psi_{H'} = \int \Delta a_{H''H'}\, \psi_{H''}\, dH''$$

(where $\Delta a_{H''H'} = \Delta a_{H''K'}$). This can be done with the
help of the equation

$$(H + S - K')(\psi_{H'} + \Delta\psi_{H'}) = 0,$$

which can be written in the form

$$(H - K')\, \Delta\psi_{H'} = -S\,(\psi_{H'} + \Delta\psi_{H'}) \tag{156}$$

and which differs from the approximate equation (153) by leaving
'unsplit' the energy $K'$ of the perturbed motion and by preserving the
small term $S\,\Delta\psi_{H'}$. Dropping it, we get the equation of the
first approximation:

$$(H - K')\, \Delta_1\psi_{H'} = -S\,\psi_{H'}. \tag{156 a}$$

Substituting on the right side the nth-order correction
$\Delta_n\psi_{H'}$ for $\psi_{H'}$, we get the equation for the
correction of the (n+1)th order,

$$(H - K')\, \Delta_{n+1}\psi_{H'} = -S\,\Delta_n\psi_{H'}, \tag{156 b}$$

the exact function $\Delta\psi_{H'}$ being thus defined as the limit of
the series $\Delta_1\psi_{H'} + \Delta_2\psi_{H'} + \ldots$

This method has been worked out by Born in connexion with collision
problems (see Part III). It can be applied also to the case of
discrete spectra (thus enabling one to avoid the determination of the
transformation coefficients $a$); but in this case it must be modified
by putting $K' = H' + \Delta H' = H' + \Delta_1 H' + \Delta_2 H' + \ldots$,
which leads to the equations

$$\begin{aligned}
(H - H')\, \Delta_1\psi_{H'} &= -(S - \Delta_1 H')\, \psi_{H'}, \\
(H - H')\, \Delta_{n+1}\psi_{H'} &= -(S - \Delta_1 H')\, \Delta_n\psi_{H'}
 + (\Delta_2 H')\, \Delta_{n-1}\psi_{H'} + \ldots + (\Delta_{n+1} H')\, \psi_{H'}.
\end{aligned} \tag{157}$$

The problem becomes more complicated, for we must determine not only
the functions $\Delta_1\psi_{H'}$, $\Delta_2\psi_{H'}$, etc., but at the
same time the numbers $\Delta_1 H'$, $\Delta_2 H'$, ... This can be done
with the help of the so-called orthogonality property of the
non-homogeneous linear equations of the form

$$(H - H')\, \chi = f. \tag{157 a}$$

This 'orthogonality' consists in the following: Multiplying the
preceding equation by the solution $\psi_{H'}$ of the corresponding
homogeneous equation $(H - H')\,\psi_{H'} = 0$, or its conjugate complex
$\psi^*_{H'}$, and integrating, we get, in view of the self-adjointness
of the operator $H$,

$$\int \psi^*_{H'}\, (H - H')\, \chi\, dV = \int \chi\, (H - H')\, \psi^*_{H'}\, dV = 0,$$

and consequently

$$\int \psi^*_{H'}\, f\, dV = 0. \tag{157 b}$$

Applying this 'orthogonality property' to the first of equations (157),
we get

$$\Delta_1 H' \int \psi^*_{H'}\, \psi_{H'}\, dV = \int \psi^*_{H'}\, S\, \psi_{H'}\, dV,$$

that is, $\Delta_1 H' = S^0_{H'H'}$. Applying it to the second, we get
in a similar way

$$\Delta_2 H' = \int \psi^*_{H'}\, (S - \Delta_1 H')\, \Delta_1\psi_{H'}\, dV,$$

which can easily be evaluated after $\Delta_1\psi_{H'}$ has been
determined from the first of equations (157). This process can be
prolonged as far as one may desire, the determination of $\Delta_n H'$
always preceding by one step that of $\Delta_n\psi_{H'}$.

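If (157 a) is multiplied by $\psi^*_{H''}$ instead of $\psi^*_{H'}$ we
obtain, on integration,

$$(H'' - H') \int \psi^*_{H''}\, \chi\, dV = \int \psi^*_{H''}\, f\, dV. \tag{157 c}$$

This gives, if applied to the first of equations (157),

$$\int \psi^*_{H''}\, \Delta_1\psi_{H'}\, dV = \frac{S^0_{H''H'}}{H' - H''},$$

i.e. the expression for the coefficient $\Delta_1 a_{H''K'}$. This is
quite natural, for if we put
$\Delta_1\psi_{H'} = \sum_{H''} \Delta_1 a_{H''K'}\, \psi_{H''}$, then,
in view of the orthogonality of the functions $\psi_{H''}$ and
$\psi_{H'''}$, we get
$\int \psi^*_{H''}\, \Delta_1\psi_{H'}\, dV = \Delta_1 a_{H''K'}$.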

The preceding results obviously hold for the case only when the 
unperturbed problem is not degenerate, and must be modified if there 
is degeneracy—either absolute or relative. 

We shall, however, leave that case aside and shall briefly examine
the approximate effect produced by the perturbation on any physical
quantity $F$ described as a matrix, from the point of view of $H$ in the
case of the unperturbed motion and that of $K$ in that of the perturbed
one. This can be readily done after we have succeeded in determining
the supposedly small quantities $\Delta a_{H'K''}$ or
$\Delta\psi_{H'}$. Putting

$$F_{K'K''} - F^0_{H'H''} = \Delta F^0_{H'H''},$$

we have

$$\Delta F^0_{H'H''} = (a^\dagger F a - F)^0_{H'H''},$$

or, since $a = \delta + \Delta a$ and $\delta F = F\delta = F$,

$$\Delta F^0_{H'H''} = (F\,\Delta a + \Delta a^\dagger F)^0_{H'H''} + (\Delta a^\dagger F\, \Delta a)^0_{H'H''}.$$

This gives, to the first order of approximation,

$$\Delta_1 F^0_{H'H''} = (F\,\Delta_1 a + \Delta_1 a^\dagger F)^0_{H'H''},$$

or in the case of a discrete H-spectrum (with no degeneracy or a
degeneracy accounted for by a preliminary transformation), according to
(148 b),

$$\Delta_1 F^0_{H'H''} = \sum_{H'''} \frac{F^0_{H'H'''}\, S^0_{H'''H''}}{H'' - H'''}
 + \sum_{H'''} \frac{S^0_{H'H'''}\, F^0_{H'''H''}}{H' - H'''}, \tag{158}$$

since

$$\Delta_1 a_{H'''H''} = \frac{S^0_{H'''H''}}{H'' - H'''}, \qquad
\Delta_1 a^\dagger_{H'H'''} = \Delta_1 a^*_{H'''H'} = -\Delta_1 a_{H'H'''} = \frac{S^0_{H'H'''}}{H' - H'''}.$$

Putting $H'' = H'$ and writing $H''$ for $H'''$ we obtain, in
particular,

$$\Delta_1 F^0_{H'H'} = \sum_{H''} \frac{F^0_{H'H''}\, S^0_{H''H'} + S^0_{H'H''}\, F^0_{H''H'}}{H' - H''}. \tag{158 a}$$

This formula determines the change of the average or probable values
of $F$ for the different unperturbed states as compared with the
corresponding perturbed states. Putting $F = S$, we get

$$\Delta_1 S^0_{H'H'} = 2 \sum_{H''} \frac{|S^0_{H'H''}|^2}{H' - H''}.$$

Comparing this with (149 a), we obtain the following relation between
the second-order correction for the energy and the first-order
correction for $S^0_{H'H'}$:

$$\Delta_2 H' = \tfrac{1}{2}\, \Delta_1 S^0_{H'H'}. \tag{158 b}$$

This formula is quite similar to

$$\Delta_1 H' = S^0_{H'H'}$$

and can be further generalized with the result

$$\Delta_n H' = \frac{1}{n}\, \Delta_{n-1} S^0_{H'H'}$$

if higher-order corrections for the matrix elements are taken into
consideration, according to (157 a). We shall not, however, consider in
detail this question, which can easily be solved by substituting in
(157 a) the expressions $\Delta a = \Delta_1 a + \Delta_2 a + \ldots$
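The relation (158 b) and the recursive scheme of equations (157) can be tried out on a small matrix model. The sketch below is an illustration and not part of the original text: the function name, the three-level spectrum and the real symmetric perturbation matrix are all assumed for the example. It evaluates $\Delta_1 H'$, then $\Delta_2 H'$ from the first-order coefficients $\Delta_1 a$, and separately $\Delta_1 S^0_{H'H'}$ from (158 a) with $F = S$.

```python
def corrections(levels, s, k):
    """First- and second-order corrections for the non-degenerate level k.

    levels : unperturbed energies H' (hypothetical numbers)
    s      : real symmetric perturbation matrix S0 (hypothetical numbers)
    """
    others = [m for m in range(len(levels)) if m != k]
    # Delta_1 H' = S0_{H'H'}
    d1_energy = s[k][k]
    # Delta_1 a_{H''H'} = S0_{H''H'} / (H' - H'')
    d1_a = {m: s[m][k] / (levels[k] - levels[m]) for m in others}
    # Delta_2 H' = integral of psi* (S - Delta_1 H') Delta_1 psi; the
    # Delta_1 H' term drops out because Delta_1 psi is orthogonal to psi.
    d2_energy = sum(s[k][m] * d1_a[m] for m in others)
    # (158 a) with F = S
    d1_s_diag = sum(
        (s[k][m] * s[m][k] + s[k][m] * s[m][k]) / (levels[k] - levels[m])
        for m in others
    )
    return d1_energy, d2_energy, d1_s_diag

levels = [0.0, 1.0, 2.5]
s = [[0.02, 0.05, 0.01],
     [0.05, -0.01, 0.03],
     [0.01, 0.03, 0.04]]
d1_energy, d2_energy, d1_s_diag = corrections(levels, s, 0)
print(d1_energy, d2_energy, 0.5 * d1_s_diag)
```

The last two printed numbers coincide, which is the content of (158 b). For a two-level model the sum $H' + \Delta_1 H' + \Delta_2 H'$ can also be compared with the exact root of the quadratic secular equation; the two agree up to terms of the third order in $S$.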

For the sake of illustration we shall apply the preceding equations
to the case of a hydrogen-like atom, perturbed by a homogeneous
electric field $E$ parallel to the x-axis. We have in this case
$S = -eEx$, where $x$ is the coordinate of the electron with respect to
the nucleus. Putting in (158 a) $F = ex$, we obtain the expression for
the additional electric moment induced by the field when the atom is
supposed to remain in the (non-degenerate) unperturbed state $H'$:

$$\Delta_1 (ex)^0_{H'H'} = 2e^2 E \sum_{H''} \frac{|x^0_{H'H''}|^2}{H'' - H'} = \alpha E, \tag{158 c}$$

where $\alpha$ is the polarization (or susceptibility) coefficient. The
corresponding energy must obviously be equal to
$-\tfrac{1}{2}\alpha E^2 = \tfrac{1}{2}\Delta_1 S^0_{H'H'}$, which is in
agreement with the relation (158 b), since the energy in question
corresponds to the second-order correction ($\Delta_2 H'$).

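The same results are obtained, of course, if instead of the
transformation coefficients the transformed functions $\phi$, or rather
the corrections $\Delta\psi$, are used. Limiting ourselves to the first
approximation, we get

$$F_{K'K''} = \int \phi^*_{K'}\, F\, \phi_{K''}\, dV
\cong \int \psi^*_{H'}\, F\, \psi_{H''}\, dV
 + \int \Delta_1\psi^*_{H'}\, F\, \psi_{H''}\, dV
 + \int \psi^*_{H'}\, F\, \Delta_1\psi_{H''}\, dV,$$

that is,

$$\Delta_1 F^0_{H'H''} = \int \Delta_1\psi^*_{H'}\, F\, \psi_{H''}\, dV
 + \int \psi^*_{H'}\, F\, \Delta_1\psi_{H''}\, dV. \tag{159}$$

These expressions can be used in the case of continuous H-spectra when
the functions $\Delta_1\psi_{H'}$ are determined directly by Born's
method. If they are determined with the help of the transformation
coefficients, we get, as before,
$\Delta_1 F^0_{H'H''} = (F\,\Delta_1 a + \Delta_1 a^\dagger F)^0_{H'H''}$,
which means in the present case

$$\Delta_1 F^0_{H'H''} = \int F^0_{H'H'''}\, \Delta_1 a_{H'''H''}\, dH'''
 + \int \Delta_1 a^\dagger_{H'H'''}\, F^0_{H'''H''}\, dH''' \tag{159 a}$$

instead of (158).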

In conclusion the following remark should be made. It can happen
that, while the unperturbed motion is confined to a finite region and
has accordingly, within a certain interval of energy values, a discrete
spectrum, the perturbed motion has, within the same interval, a
continuous energy spectrum, which means that the perturbing forces,
even when small, can extract the particle and drive it to infinity. An
example of this condition is furnished by the action of a homogeneous
electric field on a hydrogen atom. In the region of low energy values
the continuous energy spectrum, corresponding to the presence of the
electric field, practically reduces to a discrete one, with each
H-level split up (as a consequence of degeneracy) into several
sub-levels. This phenomenon is known as the Stark effect. The
sub-levels in question have, however, a certain effective width which
increases with the strength of the electric field and which corresponds
to the phenomenon of predissociation, discussed in Part I, § 16. This
means that there exists a certain probability for the atom to be
ionized by the electric field even if the unperturbed state of the atom
corresponds to the lowest energy. The width of the energy-levels
becomes, however, marked only for unperturbed states which correspond
to comparatively high energy-levels, where the energy spectrum of the
perturbed atom becomes practically continuous. In the case of the
unperturbed atom, the continuous spectrum starts at the point where the
energy is equal to zero, while for a perturbed atom it starts below
this point, and indeed the farther below, the larger the perturbing
electric field.


21. Perturbation Theory involving the Time; General Processes; 
Theory of Transitions 

In all the foregoing developments the time has been completely
ignored. This has been possible because we have limited ourselves to
the consideration of such physical quantities as do not depend upon
the time. It may seem, at first sight, that the introduction of the time
as an independent variable into the expression of an operator, say $F$,
representing some variable physical quantity, would only have the
effect of making its characteristic values, and consequently the states
specified by them, functions of the time. That this is not so is clear,
however, from the example of the energy. If the energy operator $K$
contains the time explicitly, then an equation of the type
$(K - K')\,\phi_{K'} = 0$ has no physical meaning and must be replaced
by the general equation of motion

$$(K + p_t)\,\phi = 0, \tag{160}$$

where $p_t = \dfrac{h}{2\pi i}\dfrac{\partial}{\partial t}$. The equation
$(K - K')\,\phi_{K'} = 0$ would correspond to the treatment of the time
as a simple parameter; from the purely mathematical point of view, the
appearance of the time would have no particular meaning, save that of
making the characteristic values $K'$ and the characteristic functions
$\phi_{K'}(t)$ definite functions of the time. These functions, as well
as the corresponding characteristic values $K'(t)$, would, however,
have nothing to do with those functions $\phi(x, t)$ which describe
wave-mechanically the motion determined by the energy operator $K$ and
which are the solutions of equation (160).

So long as $K$ depends upon the time, this equation does not admit
particular solutions of the type
$\phi = \phi^0_{K'}(x)\, e^{-i2\pi K' t/h}$, which means, from the
physical point of view, that $K$ has no characteristic values, or, in
other words, that the values of a variable energy cannot be specified.

This result constitutes one of the fundamental differences between 
wave mechanics and classical mechanics, where the value of a variable 
energy can always be ascertained as a definite function of the time. 
The same refers to other operators involving the time as an independent 
variable. 

It is true that the energy is more intimately connected with the time
than any other operator. It seems, however, doubtful whether an
equation of the form $F\psi = F'\psi$ defining the characteristic values
of an operator $F$ has any meaning if $F$ depends upon the time, so long
at least as the latter is treated on an entirely different basis from
that of the coordinates $x$, $y$, $z$. The exceptional role of the time
is revealed by the fact that, in contradistinction to the coordinates,
it cannot be used for the specification of the states, the latter being
referred, in general, to a particular instant of time. The time,
therefore, cannot be treated on the same lines as the coordinates and
other physical quantities, and, in particular, it cannot be represented
as an operator or a matrix with regard to some other basic quantity.
Even when completely 'inactive', the time remains above the realm of
ordinary quantities, ruling out the very possibility of their
determination (so far as exact and not probable values are concerned)
by its active interference.

Nevertheless, the transformation theory which has been developed 
in the preceding chapter can be applied in a somewhat modified and 
generalized form to variable quantities and, in particular, to the energy 
K of a particle moving in a variable field of force. 

If the variable part of K refers to a comparatively small force, we 
can regard the latter as a perturbing factor causing transitions between 
the states specified by the part of K which does not contain the time. 
This theory of transitions has been outlined already in Part I, § 14. 
We shall now briefly recapitulate it, using the new notation, and we 
shall point out its connexion with the transformation theory. 

The variable part of $K$, which will be regarded as the perturbation
energy, will be denoted, as before, by $S$, and the constant part by
$H$. The function $\phi(x, t)$, which is the general solution of
equation (160), can be represented as a superposition of the
(normalized) functions which correspond to the different states
specified by the operator $H$, with suitably determined variable
coefficients.

Taking first the case of a discrete H-spectrum, we shall put
accordingly

$$\phi(x, t) = \sum_{H'} c_{H'}(t)\, \psi_{H'}, \tag{160 a}$$

with

$$\psi_{H'} = \psi^0_{H'}(x)\, e^{-i2\pi H' t/h},$$

or

$$\phi(x, t) = \sum_{H'} C_{H'}(t)\, \psi^0_{H'}(x), \tag{160 b}$$

where

$$C_{H'}(t) = c_{H'}(t)\, e^{-i2\pi H' t/h}. \tag{160 c}$$

Substituting (160 a) in (160) and taking into account that the
functions $\psi_{H'}$ satisfy the equation $(H + p_t)\,\psi_{H'} = 0$,
we have

$$(H + S + p_t) \sum_{H'} c_{H'}\, \psi_{H'} = \sum_{H'} \left[(p_t c_{H'})\, \psi_{H'} + c_{H'}\, S\psi_{H'}\right] = 0.$$

Since

$$S\psi_{H'} = \sum_{H''} S_{H''H'}\, \psi_{H''},$$

we get

$$\sum_{H''} \psi_{H''} \left[p_t c_{H''} + \sum_{H'} S_{H''H'}\, c_{H'}\right] = 0,$$

whence

$$p_t c_{H''} + \sum_{H'} S_{H''H'}\, c_{H'} = 0,$$

or, interchanging $H'$ and $H''$,

$$-\frac{h}{2\pi i}\, \frac{dc_{H'}}{dt} = \sum_{H''} S_{H'H''}\, c_{H''}. \tag{161}$$

It should be remembered that the quantities $S_{H'H''}$ represent not
the matrix elements but the matrix components of the perturbation
energy, so that $S_{H'H''} = S^0_{H'H''}\, e^{i2\pi(H' - H'')t/h}$.
Further, so long as $S$ contains the time explicitly, the matrix
elements $S^0_{H'H''} = \int \psi^{0*}_{H'}\, S\, \psi^0_{H''}\, dV$ must
also be certain functions of the time, so that (161) can be written in
the form

$$-\frac{h}{2\pi i}\, \frac{dc_{H'}}{dt} = \sum_{H''} S^0_{H'H''}(t)\, e^{i2\pi(H' - H'')t/h}\, c_{H''}. \tag{161 a}$$

If we substitute in (160) the expression (160 b) instead of (160 a), we
get in the same way, without, however, separating $K$ into the parts
$H$ and $S$,

$$(K + p_t) \sum_{H'} C_{H'}\, \psi^0_{H'} = \sum_{H'} \left(\psi^0_{H'}\, p_t C_{H'} + C_{H'}\, K\psi^0_{H'}\right) = 0,$$

or, since $K\psi^0_{H'} = \sum_{H''} K^0_{H''H'}\, \psi^0_{H''}$,

$$\sum_{H'} \psi^0_{H'} \left(p_t C_{H'} + \sum_{H''} K^0_{H'H''}\, C_{H''}\right) = 0,$$

or finally

$$-\frac{h}{2\pi i}\, \frac{dC_{H'}}{dt} = \sum_{H''} K^0_{H'H''}\, C_{H''}. \tag{161 b}$$

This equation can be derived from (161 a), or the latter from it,
with the help of the relation (160 c) between the coefficients $C$ and
$c$ and the relations

$$K^0_{H'H''} = H^0_{H'H''} + S^0_{H'H''}, \qquad H^0_{H'H''} = H'\, \delta_{H'H''}.$$

As already explained in Part I, § 17, the squares of the moduli of the
coefficients $c_{H'}$ or $C_{H'}$, i.e. the quantities

$$N_{H'}(t) = C_{H'} C^*_{H'} = c_{H'} c^*_{H'}, \tag{162}$$

can be interpreted as the probabilities of finding the particle at the
instant $t$ in the unperturbed state $H'$, or, using the 'multiplex
representation', as the relative numbers of the copies of the particle
in the state $H'$ at the instant $t$. These numbers can be determined as
functions of the time with the help of equations (161 b) or (161) if the
initial values of the coefficients $C_{H'}$ (or $c_{H'}$) at some
instant $t = 0$ are supposed to be known. We shall denote them in
future by $C^0_{H'}$ and write accordingly $N_{H'}(0) = N^0_{H'}$.

The change of the numbers $N_{H'}$ with the time can be interpreted as
the result of transitions induced by the perturbing forces. So long,
however, as two or more of the numbers $N^0_{H'}$ are different from
zero, it is impossible to ascertain the original state from which the
transition to a given state takes place.
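The way in which the numbers $N_{H'}(t)$ follow from equations (161 a) can be illustrated numerically. The sketch below is not part of the original text: it assumes units in which $h/2\pi = 1$, a two-level model with a constant coupling $s$ between the levels, and a classical fourth-order Runge-Kutta integration; all names and values are illustrative. For this model the occupation of the second state is known in closed form, $(s/\Omega)^2 \sin^2 \Omega t$ with $\Omega^2 = s^2 + \tfrac{1}{4}(H'' - H')^2$, which gives an independent check.

```python
import cmath
import math

def rhs(t, c, levels, s):
    """Right-hand side of (161 a) with h/2pi = 1:
    dc_m/dt = -i * sum_n s[m][n] * exp(i (H_m - H_n) t) * c_n."""
    n = len(c)
    return [
        -1j * sum(s[m][k] * cmath.exp(1j * (levels[m] - levels[k]) * t) * c[k]
                  for k in range(n))
        for m in range(n)
    ]

def integrate(c0, levels, s, t_end, steps):
    """Classical fourth-order Runge-Kutta integration of (161 a)."""
    dt = t_end / steps
    t, c = 0.0, list(c0)
    for _ in range(steps):
        k1 = rhs(t, c, levels, s)
        k2 = rhs(t + dt / 2, [x + dt / 2 * k for x, k in zip(c, k1)], levels, s)
        k3 = rhs(t + dt / 2, [x + dt / 2 * k for x, k in zip(c, k2)], levels, s)
        k4 = rhs(t + dt, [x + dt * k for x, k in zip(c, k3)], levels, s)
        c = [x + dt / 6 * (a + 2 * b + 2 * d + e)
             for x, a, b, d, e in zip(c, k1, k2, k3, k4)]
        t += dt
    return c

levels = [0.0, 0.5]                 # hypothetical unperturbed energies
s = [[0.0, 0.2], [0.2, 0.0]]        # hypothetical constant coupling
t_end = 5.0

# All copies start in the first state: c = (1, 0) at t = 0.
c = integrate([1.0 + 0j, 0.0 + 0j], levels, s, t_end, 2000)
occupation = [abs(x) ** 2 for x in c]   # the numbers N_{H'}(t) of (162)

omega = math.sqrt(s[0][1] ** 2 + ((levels[1] - levels[0]) / 2) ** 2)
rabi = (s[0][1] / omega) ** 2 * math.sin(omega * t_end) ** 2
print(occupation, rabi)
```

The sum of the two occupation numbers stays equal to 1 throughout the integration, and the second one agrees with the closed-form expression.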

In order to be able to speak of definite transitions to a given final
state from a given initial state, we must therefore assume that
initially all the copies of the particle were in the same state, $H'$
say. This means that all the coefficients $C^0_{H''}$ must be set equal
to zero, with the exception of one of them, $C^0_{H'}$, which can be put
equal to 1. This can be expressed by means of the formula

$$C^0_{H''} = \delta_{H''H'}, \tag{162 a}$$

which serves to show that the coefficients $C_{H''}(t)$, not only for
$t = 0$ but also for $t > 0$, can be considered as the elements of a
matrix, which we shall call the transition matrix and shall denote by
the same letter $C$. The value of the coefficient $C_{H''}$ at the time
$t$, on the assumption of a definite initial state $H'$, will thus be
denoted by

$$C_{H''}(t) = C_{H''H'}(t), \tag{162 b}$$

the initial value of the matrix $C$ being $\delta$ (that is, 1).

The formula $\phi = \sum_{H'} C_{H'}\, \psi^0_{H'}$ represents the
general solution of Schrödinger's equation (160). That particular
solution of it which reduces to $\psi_{H'}$ at the initial instant
$t = 0$ can conveniently be denoted by $\phi_{H'}$. We thus get for
particular solutions of this type, which approximate to the particular
solutions of the equation of the unperturbed motion
$(H + p_t)\,\psi = 0$, the following formula:

$$\phi_{H'} = \sum_{H''} C_{H''H'}\, \psi^0_{H''}, \tag{163}$$

which shows that the transition matrix $C(t)$ can be regarded as the
transformation matrix from the wave functions $\psi^0_{H'}$ to the wave
functions $\phi_{H'}$. The latter can no longer be denoted by
$\phi_{K'}$, as was done before, since $K$ has no characteristic
values; these characteristic values can, however, be replaced by a kind
of 'reminiscence' of the particular solutions of the equation
$(K + p_t)\,\phi = 0$ about the H-state they represented at the instant
$t = 0$.

It can easily be shown that the functions $\phi_{H'}$, $\phi_{H''}$,
etc., are mutually orthogonal, just as are the functions $\phi_{K'}$,
$\phi_{K''}$ considered before.

We have in fact

$$(K + p_t)\,\phi_{H'} = 0, \qquad (K^* - p_t)\,\phi^*_{H''} = 0.$$

Multiplying the first of these equations by $\phi^*_{H''}$ and the
second by $\phi_{H'}$, subtracting one from the other, and integrating
over the coordinates, we get

$$\int \left(\phi^*_{H''}\, K\phi_{H'} - \phi_{H'}\, K^*\phi^*_{H''}\right) dV
 = -\frac{h}{2\pi i} \int \frac{\partial}{\partial t}\left(\phi^*_{H''}\, \phi_{H'}\right) dV,$$

or, since the left-hand side vanishes (so long as $K$, in spite of its
dependence upon the time, preserves the property of self-adjointness),
we get

$$\frac{d}{dt} \int \phi^*_{H''}\, \phi_{H'}\, dV = 0.$$

We thus see that the value of the integral
$\int \phi^*_{H''}\, \phi_{H'}\, dV$ does not depend upon the time.
Since at the initial moment $t = 0$ we have $\phi_{H'} = \psi^0_{H'}$
and $\phi_{H''} = \psi^0_{H''}$, it follows from this that the functions
$\phi_{H'}$ satisfy, irrespective of the time, the same orthogonality
and normalizing conditions

$$\int \phi^*_{H''}\, \phi_{H'}\, dV = \delta_{H'H''} \tag{163 a}$$

as the functions $\psi^0_{H'}$.

Substituting in these equations the expressions (163), we have further

$$\int \phi^*_{H''}\, \phi_{H'}\, dV = \sum_{H'''} \sum_{H^{\mathrm{iv}}} C^*_{H'''H''}\, C_{H^{\mathrm{iv}}H'} \int \psi^{0*}_{H'''}\, \psi^0_{H^{\mathrm{iv}}}\, dV,$$

that is,

$$\int \phi^*_{H''}\, \phi_{H'}\, dV = \sum_{H'''} C^*_{H'''H''}\, C_{H'''H'},$$

and consequently

$$\sum_{H'''} C^*_{H'''H''}\, C_{H'''H'} = \delta_{H'H''}. \tag{163 b}$$

This equation shows that the transition matrix is unitary
($C^\dagger = C^{-1}$), just as are the ordinary transformation
matrices, which have been considered in the preceding sections and
which do not depend upon the time. The transformation equations (163)
can be written accordingly in the ordinary matrix form

$$\phi = \psi^0 C \quad \text{or} \quad \psi^0 = \phi\, C^\dagger. \tag{163 c}$$

It follows from these results that the functions $\phi_{H'}$ specify
perfectly definite states in the same sense as those which would be
represented by the functions $\phi_{K'}$ if $K$ were independent of the
time and had definite characteristic values; the only difference
between them being that the former vary with the time while the latter
should remain constant.

The set of states specified by the functions $\phi_{H'}$ can be
represented geometrically as an orthogonal system of coordinates in the
state-space, the transformation coefficients $C_{H''H'}$ denoting the
cosines of the angles between the fixed axes which represent the states
$\psi^0_{H''}$ and the movable axes which represent the states
$\phi_{H'}$. This movable system of axes, rotating like a solid body in
the state-space, can be regarded as the geometrical representation of
the variable energy $K$.

One might be inclined to go a step further and to represent $K$ by
a quadric surface defined by the equation

$$\sum_{H'} \sum_{H''} K^0_{H'H''}\, a^*_{H'}\, a_{H''} = \text{const.},$$

thus fixing not only the directions but also the lengths of the axes
associated with $K$, i.e. the characteristic values of the latter. This
argument is, however, fallacious because the preceding equation has
nothing to do with the representation of the variable surface $K$,
which we have been considering, but represents in reality the
fictitious 'quasi-constant' energy operator $K$ with the time treated
as a simple parameter.

The fallacy of the above argument becomes especially apparent when
$K$ is actually constant (which can be considered as a special case of
a variable $K$). The equation
$\sum_{H'} \sum_{H''} K^0_{H'H''}\, a^*_{H'}\, a_{H''} = \text{const.}$
will then represent $K$ as a quadric surface fixed in the state-space.
Nothing, however, will prevent us from solving the equation
$(K + p_t)\,\phi = 0$ in this case in the same way as in the preceding
case, namely, by taking particular solutions not of the usual K-type,
$\phi_{K'} = \phi^0_{K'}\, e^{-i2\pi K' t/h}$, but of the H-type, i.e.
such that, at the initial moment $t = 0$, $\phi$ coincides with one of
the functions $\psi^0_{H'}$. The functions $\phi_{H'}$ so obtained will
represent for $t \neq 0$ states entirely different both from those
specified by the functions $\psi_{H'}$ and from those specified by the
functions $\phi_{K'}$. In order to avoid confusion, we shall denote the
characteristic functions of $K$ (when they exist of course, i.e. when
$K$ is independent of the time) by $\chi_{K'}$ instead of $\phi_{K'}$.
The connexion between these functions and the functions $\psi^0_{H'}$,

$$\chi^0_{K'} = \sum_{H'} a_{H'K'}\, \psi^0_{H'}, \tag{164}$$

which has been investigated before, is represented by a constant
transformation matrix $a$, which has nothing to do with the variable
matrices $C$ and $c$.

It should be remarked that the elements of these matrices are
connected with each other, according to (160 c), by the relation

$$c_{H''H'} = C_{H''H'}\, e^{i2\pi H'' t/h},$$

which is not symmetrical with regard to the two indices and is in
agreement with the unitary character of the two matrices.

The transformation matrix $a$ can be derived from the general
equations (161 b) if the condition that the function $\phi$ should
reduce to $\psi^0_{H'}$ for $t = 0$ is replaced by the condition that it
should be a harmonic function of the time of the type

$$\phi = \chi_{K'''} = \chi^0_{K'''}\, e^{-i2\pi K''' t/h}. \tag{164 a}$$

This means, on account of the equation

$$\phi = \sum_{H'} C_{H'}\, \psi^0_{H'},$$

that all the coefficients $C_{H'}$ should also be of the type

$$C_{H'} = C^0_{H'}\, e^{-i2\pi K''' t/h}. \tag{165}$$

The differential equations (161 b) reduce, subject to this condition,
to a system of ordinary algebraic equations for the amplitudes
$C^0_{H'}$,

$$K'''\, C^0_{H'} = \sum_{H''} K^0_{H'H''}\, C^0_{H''}, \tag{165 a}$$

which are obviously identical with the equations determining the
transformation coefficients $a$.

We thus get $C^0_{H'} = a_{H'}$, or more exactly

$$C^0_{H'K'''} = a_{H'K'''}. \tag{165 b}$$


The relations between the functions
$\chi_{K'} = \chi^0_{K'}\, e^{-i2\pi K' t/h}$ and
$\psi_{H'} = \psi^0_{H'}\, e^{-i2\pi H' t/h}$ can be obtained from (164)
if the coefficients $a_{H'K'}$ are replaced by

$$f_{H'K'} = a_{H'K'}\, e^{i2\pi(H' - K')t/h}. \tag{166}$$

These coefficients also constitute a unitary matrix $f$. Combining the
matrix equations $\chi = \psi f$ and $\phi = \psi c$, we can easily
obtain a direct relation between the functions $\phi$ and $\chi$. We
have, namely, $\psi = \chi f^\dagger$, and consequently
$\phi = \chi d$, with the transformation matrix

$$d = f^\dagger c.$$

Written in matrix elements, these equations run

$$\phi_{H'} = \sum_{K''} d_{K''H'}\, \chi_{K''}, \tag{166 a}$$

with

$$d_{K''H'} = \sum_{H''} f^*_{H''K''}\, c_{H''H'},$$

or

$$d_{K''H'} = \sum_{H''} a^*_{H''K''}\, c_{H''H'}\, e^{i2\pi(K'' - H'')t/h}. \tag{166 b}$$

Putting

$$d_{K''H'} = \eta_{K''H'}\, e^{i2\pi K'' t/h},$$

we can rewrite (166 a) in the more convenient form

$$\phi_{H'} = \sum_{K''} \eta_{K''H'}\, \chi^0_{K''}, \tag{166 c}$$

with

$$\eta_{K''H'} = \sum_{H''} a^*_{H''K''}\, C_{H''H'}, \tag{166 d}$$

showing that the dependence of $\eta_{K''H'}$ on the time is fully
determined by the transformation coefficients $C_{H''H'}$.
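For a constant $K$ the relations (164) to (166 d) fix the transition matrix completely: expanding $\psi^0_{H'}$ in the harmonic solutions $\chi_{K''}$ and letting each of them run with its own time factor gives $C_{H''H'}(t) = \sum_{K'''} a_{H''K'''}\, e^{-i2\pi K''' t/h}\, a^*_{H'K'''}$. This closed form is a consequence drawn here for illustration and is not written out in the text above. The sketch below assumes $h/2\pi = 1$, a real two-dimensional rotation as the matrix $a$, and arbitrary illustrative characteristic values; it checks the unitarity condition (163 b) and the initial condition $C(0) = \delta$.

```python
import cmath
import math

def transition_matrix(a, k_values, t):
    """C_{mj}(t) = sum_k a[m][k] * exp(-i K_k t) * conj(a[j][k]),
    valid for a time-independent K (h/2pi = 1 assumed)."""
    n = len(a)
    return [
        [sum(a[m][k] * cmath.exp(-1j * k_values[k] * t)
             * complex(a[j][k]).conjugate()
             for k in range(n))
         for j in range(n)]
        for m in range(n)
    ]

def gram(c):
    """Left-hand side of (163 b): sum_m conj(C[m][i]) * C[m][j]."""
    n = len(c)
    return [[sum(c[m][i].conjugate() * c[m][j] for m in range(n))
             for j in range(n)]
            for i in range(n)]

theta = 0.3                          # hypothetical mixing angle
a = [[math.cos(theta), -math.sin(theta)],
     [math.sin(theta), math.cos(theta)]]
k_values = [0.2, 1.1]                # hypothetical characteristic values K'
t = 1.7

c_t = transition_matrix(a, k_values, t)
c_0 = transition_matrix(a, k_values, 0.0)
g = gram(c_t)
print(g)
```

The matrix `g` is the unit matrix to rounding accuracy, and the off-diagonal element of `c_t` has the squared modulus $\sin^2 2\theta\, \sin^2[(K'' - K')t/2]$ familiar from the two-state problem.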

Equations of exactly the same type as (165 a) are obtained in classical
mechanics for the amplitudes of the free oscillations of a system of
particles held together by 'quasi-elastic' forces, i.e. forces which are
proportional to their displacements both from the respective equilibrium
positions and relative to each other. Such a system can be realized in
the simplest form by a set of coupled pendulums which can oscillate
in a definite plane under the influence of gravity and of forces due to
their being coupled together (by means of lateral strings or
otherwise).† Let $\xi_1, \xi_2, \ldots$ be the displacements of the
given particles, or pendulums, from their position of rest. Their
dependence upon the time is determined by a system of equations of the
form

$$-\frac{d^2 \xi_n}{dt^2} = \sum_m \Phi_{nm}\, \xi_m. \tag{167}$$

The coefficients $\Phi_{nn}$ thus specify the binding of the separate
particles to their positions of rest, and so determine the free
vibrations which they would carry out in the absence of any coupling
with the other particles. The coefficients $\Phi_{nm} = \Phi_{mn}$
($m \neq n$) describe, on the other hand, the perturbing coupling
forces.

If we put

$$\Phi_{nn} = \Phi^0_{nn} + \Phi'_{nn}, \qquad \Phi_{nm} = \Phi'_{nm} \quad (n \neq m),$$

we can then regard the above equations as the equations of the perturbed
motion of the given quasi-elastic system. By the unperturbed motion
we are to understand the vibrations determined by the equations

$$-\frac{d^2 \xi_n}{dt^2} = \Phi^0_{nn}\, \xi_n.$$

In this case each particle (pendulum or current) vibrates quite
independently of the others and with a frequency

$$\nu^0_n = \frac{1}{2\pi} \sqrt{\Phi^0_{nn}}.$$

In the presence of perturbing coupling forces such independent
harmonic vibrations of the separate particles (or pendulums) are not
possible. They become replaced by harmonic vibrations of a different
kind, so-called 'normal vibrations' of the system, in which with regard
to any kind of vibration characterized by the common frequency
$\omega_k$ all particles participate with definite relative amplitudes
and definite phase differences. The real amplitude and the initial
phase (at time $t = 0$) of each particle can be defined respectively as
the modulus and the argument of a complex amplitude
$\gamma_n = |\gamma_n|\, e^{i\delta_n}$. These complex amplitudes and the
corresponding frequencies of vibration can be determined from the
equations of motion if we make the substitution

$$\xi_n = \gamma_n\, e^{-i\omega t} \tag{167 a}$$

for the variables $\xi_n$. Equations (167) then reduce to the form

$$\sum_m \Phi_{nm}\, \gamma_m = \omega^2\, \gamma_n, \tag{167 b}$$

and thus with $\omega^2 = K'''$ and $\Phi_{nm} = K^0_{H'H''}$ become
identical with the 'wave mechanics' equations (165 a).

† Instead of a mechanical model we could use, for the illustration of
the equations (165 a), an electric model, formed by a system of
electrically coupled electric circuits.

The general solution of the classical vibration problem (167), just
as of the corresponding 'wave mechanics' problem $(K + p_t)\,\chi = 0$,
is obtained by superposition of all harmonic particular solutions (with
arbitrary constant coefficients).
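The simplest instance of (167) and (167 b) is a pair of identical pendulums joined by a weak spring. The sketch below is an illustration under stated assumptions and not part of the original text: the matrix $\Phi$ is taken as $\Phi_{11} = \Phi_{22} = \omega_0^2 + k$, $\Phi_{12} = \Phi_{21} = -k$, with illustrative numbers for $\omega_0^2$ and $k$. It computes the two normal frequencies from (167 b) and then superposes the two normal vibrations so that only the first pendulum is displaced at $t = 0$.

```python
import math

def normal_modes_2x2(phi):
    """Roots omega^2 of (167 b) for a real 2x2 coefficient matrix phi."""
    p11, p12 = phi[0]
    p21, p22 = phi[1]
    mean = 0.5 * (p11 + p22)
    root = math.sqrt((0.5 * (p11 - p22)) ** 2 + p12 * p21)
    return mean - root, mean + root

def displacements(t, w0_sq, k):
    """Superposition of the in-phase and antiphase normal vibrations of two
    identical pendulums, chosen so that xi_1(0) = 1 and xi_2(0) = 0."""
    w_minus = math.sqrt(w0_sq)            # in-phase mode (1, 1)
    w_plus = math.sqrt(w0_sq + 2 * k)     # antiphase mode (1, -1)
    x1 = 0.5 * (math.cos(w_minus * t) + math.cos(w_plus * t))
    x2 = 0.5 * (math.cos(w_minus * t) - math.cos(w_plus * t))
    return x1, x2

w0_sq, k = 1.0, 0.05                      # hypothetical values
phi = [[w0_sq + k, -k], [-k, w0_sq + k]]
lo_sq, hi_sq = normal_modes_2x2(phi)

# After the time pi / (w_plus - w_minus) the two modes are in opposite
# phase on the first pendulum, and its displacement vanishes.
t_star = math.pi / (math.sqrt(hi_sq) - math.sqrt(lo_sq))
x1_star, x2_star = displacements(t_star, w0_sq, k)
print(lo_sq, hi_sq, x1_star, x2_star)
```

The slow passage of the vibration from the first pendulum to the second and back is the classical counterpart of the periodic variation of the occupation numbers $|C_{H'}|^2$ of two coupled states.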

The similarity of the two problems enables us to relate the
perturbation theory of quantum mechanics, in a very clear manner, to
the classical theory of weakly coupled particles or pendulums. The
'pendulum model' (which can serve just as well for the illustration
both of the wave-mechanical and the electromagnetic vibrations) proves
to be especially convenient. Such a model consists of an infinite
series of pendulums which are suspended along a horizontal line in the
order of increasing frequencies of the unperturbed vibrations, i.e. in
the order of decreasing lengths, and which can be bound to one another
in pairs (see Fig. 2). Thus each pendulum corresponds to a definite
quantized state of the unperturbed system (atom, molecule), i.e. to a
definite characteristic function $\psi^0_{H'}$. In the case of
'degeneracy', i.e. when several different pendulums have the same
unperturbed vibration frequency $\nu^0_{H'} = H'/h$, we can ascribe the
same length to the corresponding pendulums (in general, however, a
different mass) and place them beside one another transversely to the
original direction of suspension.

If, under the given conditions of the motion, there exists, besides
a discrete set of states, also a continuous set of stationary states,
then the discrete pendulum series of our model must be supplemented by
a continuous series, which can be conceived as a compact heavy fabric.
For this fabric not to tear, the amplitudes and phases of the vibration
of its vertical elements must be continuous functions of the
(unperturbed) vibration frequency $\nu^0 = H'/h$.‡

From the point of view of the wave conception, the correspondence
between the vibrations of our pendulum model and the vibration
process in the corresponding mechanical system is very straightforward
and suggestive. Thus the different types of standing waves represented
by the functions $\psi^0_{H'}$ play the role of the single pendulums;
while the coefficients $C_{H'}$ (or $c_{H'}$) are the (complex)
amplitudes of vibration.

‡ We could replace the pendulum model by a string model (limiting
ourselves to the fundamental vibrations of each string). The continuous
spectrum in this model would be represented by a membrane. Such a
membrane must, however, possess quite unusual properties which are
incompatible with the ordinary equations of the theory of elasticity
(for these equations correspond to a coupling between the neighbouring
elements of the elastic continuum only).

This correspondence acquires a purely symbolic character, however, 
when we go over from the wave picture to the corpuscular picture. The 
amplitude coefficients then acquire a quite different physical meaning; 
for their norms $C_{H'} C^*_{H'} = |C_{H'}|^2$ then determine the relative number
of the copies of the given particle which are in the corresponding state.
To the continuous alteration of these coefficients with the time under
the action of the perturbing forces there corresponds a series of forced
transitions of these copies from one state to another. The derivative
$d|C_{H'}|^2/dt$ then gives the probability, referred to unit time, that any
copy of the particle will go over into the state $H'$ if $d|C_{H'}|^2/dt > 0$ or
out of this state if $d|C_{H'}|^2/dt < 0$.

One important difference between the pendulum model and the 
wave-mechanical vibrations it represents, consists in the normalization 
of the amplitudes of vibration to a definite value (1). A system of 
pendulums, as considered in classical mechanics, can be at rest; or if 
the system is vibrating, one has to distinguish not only the relative but 
also the absolute values of the amplitudes. So far as this model is used 
for the illustration of wave-mechanical vibrations a state of rest is 
excluded—for the particle must always be found in some one of the 
states represented by the pendulums. Moreover, only the relative values 
of the amplitudes have a physical significance as defining the probability 
amplitudes of the corresponding states—which can be taken into 
account by normalizing the sum of their norms once and for all to 1. 

In the case of certain relations between the amplitudes $\xi_n$ of the
various pendulums, these amplitudes can preserve constant values, as we 
have seen above. Such ‘normal vibrations’ of the system of pendulums 
correspond to stationary distributions of the copies of the particles 
among the different unperturbed states, and represent the stationary
states in the presence of the perturbing forces (i.e. states defined by
the energy $K$). If we introduce for the illustration of the perturbed
motion, i.e. of the vibrations defined by the operator $K$, a pendulum
model of the same kind as for the unperturbed motion (i.e. the $H$-vibrations),
then any such stationary distribution, i.e. any normal
vibration of the original model, will be represented by the vibrations
of a single pendulum of the new model. These new pendulums, representing
the transformed characteristic functions $\chi_{K'}$, must clearly be
considered as uncoupled. This means that transitions between the new 
stationary states (which are the real stationary states) are impossible. 

A transition between two different unperturbed states $H'$ and $H''$ is
possible in the first place if the corresponding matrix element of the
perturbation energy $S^0_{H'H''}$ is different from zero. The coupling coefficients
which represent these elements in our pendulum model can
be regarded as a measure of the probability amplitude for transitions
between the corresponding states. It can easily be seen, however, that
transitions are also possible between unperturbed states $H'$ and $H''$
which are only indirectly coupled with each other, the matrix element
$S^0_{H'H''}$ vanishing, but certain other elements of the type $S^0_{H'H'''}$ and
$S^0_{H'''H''}$ being different from zero. Such ‘indirect transitions’ play, as
we shall see later on, an important role in many physical phenomena.

In the case of the stationary $K$-states represented by a stationary
distribution of the copies over the various $H$-states (or by normal
vibrations of the pendulum system) the transitions between different
$H$-states can be imagined to be mutually compensated.

The variable $K_H$-states which are described by the functions $\phi_{H'}$ can
be represented in our pendulum model by vibrations which at the initial
time $t = 0$ involve one particular pendulum ($H'$) only. As time goes
on, the vibrations of this pendulum must be gradually transferred to
other pendulums, this transference representing the gradual transition
of the copies of the particle from the state $H'$ in which they were
initially supposed to be concentrated (whose probability, in other words,
was initially equal to 1) to other states.

If the energy $K$, or what amounts to the same thing the perturbation
energy $S$, depends upon the time, only $K_H$-states of this type can be
defined and represented by means of the pendulum model, while normal
vibrations corresponding to definite values of $K$ are impossible.

It is natural to consider vibrations due to an external influence, 
specified as a given function of the time, as ‘forced vibrations’. It must 
be borne in mind, however, that the forced vibrations we are referring
to are not of the usual type described by the non-homogeneous equations

where $F_n(t)$ denotes the external force acting on the $n$th pendulum.
Such external forces do not have any place in our model. They are
replaced by a so-called ‘parametric perturbation’, i.e. by a change of
the parameters $\phi_{nm}$ which determine the free vibrations of the pendulums.
In fact, the case of a perturbation energy depending upon the
time can be represented, in the pendulum model, by a type of forced 
vibrations determined by the equations 

= - 2 I 

The model will, however, adequately reproduce the actual conditions 
only when the dependence of $S$ upon the time is harmonic and if,
besides, we restrict ourselves to the case of small perturbing forces;
otherwise the agreement between the wave-mechanical equations (161 a)
or (161 b) and the classical equations will be destroyed on account of
the fact that in the former we have first derivatives with respect to
the time (multiplied by $h/2\pi i$), while in the latter we have second
derivatives ($d^2\xi_n/dt^2$). This difference is immaterial only in the case of
harmonic vibrations represented by exponential functions of the type
$e^{i2\pi\nu t}$, the differentiation with regard to the time being in both cases
equivalent to multiplication by a real constant.
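
The remark about harmonic vibrations can be written out explicitly. The following lines are only a short check, using nothing beyond the exponential time factor $e^{i2\pi\nu t}$ named in the text; they show the real constants that the two operations reduce to.

```latex
\frac{h}{2\pi i}\,\frac{d}{dt}\,e^{i2\pi\nu t} = h\nu\,e^{i2\pi\nu t},
\qquad
\frac{d^{2}}{dt^{2}}\,e^{i2\pi\nu t} = -4\pi^{2}\nu^{2}\,e^{i2\pi\nu t}.
```

For a time dependence containing several frequencies the two operations weight the separate harmonic components differently (by $h\nu$ and by $-4\pi^2\nu^2$ respectively), which is why the agreement is limited to the harmonic case.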

The preceding theory can easily be extended to the case of a continuous
or mixed energy spectrum of the unperturbed motion.

Writing, for example, 

J dH" (lt>8) 

instead of (160 a), we get 

(B+S+pM 

= ^ + J ^H"] dM" 0 . 

We have further 

Sipfj* 8 h »> }r »+ J Sipu"" dH m> , 

where $H'$ and $H'''$ refer to the discrete and $H''$ and $H''''$ to the continuous
region of the $H$-spectrum, and consequently

-5B| (l6gi) 

= J, J S H . H ,.,.c n ...,dH m J 

The only difference between the discrete and the continuous case is
that in specifying the states we must, in general, replace the discrete
values of $H'$ by elementary regions or ranges of $H''$, the number of the
copies belonging to the range $\Delta H''$ being equal to $\int_{\Delta H''} |c_{H''}|^2\,dH''$, provided
the functions $\psi_{H''}$ are duly normalized according to the equation

J = h(H"—H”) or f hi-tir- dH” = 1. 

It should be remembered that this condition is equivalent to the usual 
normalizing condition J dV = 1 for the quasi-discrete functions 

J hr dH". 

(AH') 

With the help of the latter the case of a continuous spectrum can be
dealt with in exactly the same way as the discrete case, provided we
start with finite ranges $\Delta H''$ and pass to the limit $\Delta H'' \to 0$ after having
calculated the coefficients $c$.

The actual determination of the perturbed motion by the method of 
transitions explained above, both in the case of a variable energy K 
and in the special case of a constant K, can be carried out by means 
of a process of successive approximations, based upon the following 
consideration. If there were no perturbation, then the coefficients $c$
(but not $C$) would remain constant, preserving those values $c^0$ which
they were supposed to have at the initial moment $t = 0$. The action
of the perturbation will be to modify these values, so that we can put
$c(t) = c^0 + \Delta c(t)$ and consider $\Delta c(t)$ as a small quantity, for sufficiently
weak perturbing forces and, in general, for sufficiently small values of $t$.
The latter condition constitutes an important restriction of the validity
of the approximation method in question, a restriction that does not
have any equivalent in the alternative method dealing with stationary
states and not involving the time (if $K$ does not depend upon the time).

It is, however, perfectly natural from the physical point of view, 
since, in the determination of transition probabilities, we have to limit 
ourselves to short intervals of time. Regarding the matrix components
$S_{H'H''}$ as small quantities of the first order, we can put

    $c_{H'}(t) = c^0_{H'} + \Delta_1 c_{H'}(t) + \Delta_2 c_{H'}(t) + \dots$

and obtain the corrections $\Delta_1 c$, $\Delta_2 c$, etc., by the usual scheme of
successive approximations.

Confining ourselves again, for the sake of simplicity, to the case of
a discrete spectrum, we obtain a chain of equations starting with

    $-\frac{h}{2\pi i}\frac{d}{dt}\Delta_1 c_{H'} = \sum_{H''} S_{H'H''}\,c^0_{H''}$    (169)

(first approximation),

    $-\frac{h}{2\pi i}\frac{d}{dt}\Delta_2 c_{H'} = \sum_{H''} S_{H'H''}\,\Delta_1 c_{H''}$    (169 a)

(second approximation), and so on. Since the matrix components $S_{H'H''}$
are known functions of the time, equations (169) can be integrated
directly with the result

    $\Delta_1 c_{H'}(t) = -\frac{2\pi i}{h}\sum_{H''} c^0_{H''}\int_0^t S_{H'H''}\,dt$,    (170)

which, on substitution in (169 a), gives

    $\Delta_2 c_{H'}(t) = -\frac{4\pi^2}{h^2}\sum_{H'''}\sum_{H''} c^0_{H''}\int_0^t S_{H'H'''}(t')\,dt'\int_0^{t'} S_{H'''H''}(t'')\,dt''$.    (170 a)

In a similar way one can obtain an expression for $\Delta_n c_{H'}(t)$ which is of
the $n$th order with respect to the small quantities $S_{H'H''}$, etc.

The function $S$ can usually be represented in the form of a product
of a function of the coordinates and a function of the time:

    $S = T(x,y,z)\,f(t)$,    (171)

or more generally as a sum of terms of this type. We get accordingly

    $S_{H'H''} = T^0_{H'H''}\,f(t)\,e^{i2\pi\nu_{H'H''}t}$    (171 a)

and

    $\int_0^t S_{H'H''}(t')\,dt' = T^0_{H'H''}\,f_{\nu_{H'H''}}(t)$,    (171 b)

where $\nu_{H'H''} = (H'-H'')/h$ and

    $f_\nu(t) = \int_0^t f(t')\,e^{i2\pi\nu t'}\,dt'$.    (171 c)

This function can be defined as the amplitude coefficient in the Fourier
integral representation of the function $f(t')$ within the interval $0 < t' < t$,
or more exactly of a function which is equal to $f(t')$ within this interval
and vanishes outside it. The latter function

    $f(t') = \int f_\nu(t)\,e^{-i2\pi\nu t'}\,d\nu$

can replace the actual function $f(t)$ so far as we are interested in the
results produced by the perturbation $S$ during the limited time $t$.
Turning to the quantities $N_{H'} = |c_{H'}|^2$, we get

    $N_{H'} = |c^0_{H'}|^2 + (c^{0*}_{H'}\Delta_1 c_{H'} + c^0_{H'}\Delta_1 c^*_{H'}) + |\Delta_1 c_{H'}|^2 + (c^{0*}_{H'}\Delta_2 c_{H'} + c^0_{H'}\Delta_2 c^*_{H'}) + \dots$    (172)

Terms of higher order will not be needed in future and have accordingly
been dropped. In the particular case when $c^0_{H'} = 0$, this expression
reduces to

    $N_{H'} = |\Delta_1 c_{H'}|^2$.    (172 a)

If initially the particle were supposed to be in a definite state, $H'$ say
(so that $c^0_{H''} = \delta_{H''H'}$), equations (170) and (170 a) reduce to

    $\Delta_1 c_{H''H'} = -\frac{2\pi i}{h}\int_0^t S_{H''H'}\,dt$    (173)

(with $H''$ and $H'$ interchanged) and

    $\Delta_2 c_{H''H'} = -\frac{4\pi^2}{h^2}\sum_{H'''}\int_0^t S_{H''H'''}(t')\,dt'\int_0^{t'} S_{H'''H'}(t'')\,dt''$.    (173 a)

These equations give the first and second approximation for the
elements of the ‘transition matrix’ $c_{H''H'}(t)$. We need not consider here
their geometrical representation (as determining the angles between the
fixed $H$-axes and the rotating $K_H$-axes in the state-space), since it is
identical with that of the transformation coefficients $a$, discussed in § 19.

It is also hardly necessary to point out the way in which the preceding
equations can be generalized to allow for the presence of a continuous
or mixed spectrum; all we need to do in this case is to replace
the sums wholly or partially by integrals extended over the continuously 
variable parameters. 

The equations (173) and (173 a), as well as the higher approximations
for $c_{H''H'}$, can be obtained in a more straightforward, though somewhat
symbolic, way by considering the coefficients $c_{H''H'}(t)$ as a matrix and
writing the equations (161), which serve to define them, in the matrix
form

    $-\frac{h}{2\pi i}\frac{dc}{dt} = Sc$.    (174)

We thus get, treating $S$ as an ordinary function of the time,

    $c(t) = e^{-\frac{2\pi i}{h}\int_0^t S\,dt}\,c(0)$,    (174 a)

or putting, for the sake of brevity, $\frac{2\pi}{h}\int_0^t S\,dt = R$ and expanding the
exponential in a power series

    $c(t) = (1 - iR - \tfrac{1}{2}R^2 + \dots)\,c(0)$.    (174 b)

This formula contains the two equations (173) and (173 a) as corresponding
to the terms of the first and second order in the expansion.
It is self-evident that all the multiplications must be carried out in
the order stated, according to the general rule of matrix multiplication,
and that, moreover, the matrix $c(0)$ must be defined as the unit matrix

    $c_{H''H'}(0) = \delta_{H''H'}$.


It may seem at first sight that there is a discrepancy between the
expression (173 a) and the second-order term of (174 b)

    $\Delta_2 c_{H''H'} = -\tfrac{1}{2}\sum_{H'''} R_{H''H'''}R_{H'''H'}$,

i.e.

    $\Delta_2 c_{H''H'} = -\frac{2\pi^2}{h^2}\sum_{H'''}\int_0^t S_{H''H'''}\,dt\int_0^t S_{H'''H'}\,dt$.    (174 c)

As a matter of fact, they are easily seen to be identical (by a generalization
of the well-known relation for multiple integrals with the same
variable).
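
For a single ordinary function the relation for multiple integrals referred to here, namely that the nested integral equals one-half of the square of the simple integral, can be checked numerically. The sketch below is an illustration only: the helper names, the sample function and the trapezoidal rule are assumptions of this example and do not come from the text, and the integrand is treated as an ordinary (commuting) function of the time, as in the passage above.

```python
import math


def cumulative_trapezoid(f, t, n):
    """Return samples f(k*t/n) and running integrals F_k ~ int_0^{k*t/n} f."""
    step = t / n
    values = [f(k * step) for k in range(n + 1)]
    cumulative = [0.0]
    for k in range(1, n + 1):
        cumulative.append(cumulative[-1] + 0.5 * step * (values[k - 1] + values[k]))
    return values, cumulative


def nested_integral(f, t, n=2000):
    """Approximate int_0^t dt' f(t') int_0^{t'} dt'' f(t'')."""
    step = t / n
    values, cumulative = cumulative_trapezoid(f, t, n)
    integrand = [values[k] * cumulative[k] for k in range(n + 1)]
    return sum(0.5 * step * (integrand[k - 1] + integrand[k]) for k in range(1, n + 1))


def half_square(f, t, n=2000):
    """Approximate (1/2) * (int_0^t f(t') dt')**2."""
    _, cumulative = cumulative_trapezoid(f, t, n)
    return 0.5 * cumulative[-1] ** 2


def sample(x):
    """Arbitrary smooth test function (hypothetical, chosen for illustration)."""
    return math.cos(3.0 * x) + 0.5


if __name__ == "__main__":
    print(nested_integral(sample, 2.0), half_square(sample, 2.0))
```

The two printed numbers should agree to within the discretization error of the trapezoidal rule.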

Since the exponent of the first factor in (174 a) is a pure imaginary, we get at once
the relation

    $c^\dagger(t)\,c(t) = c^\dagger(0)\,c(0) = \delta$,

which means that $\sum_{H''}|c_{H''H'}|^2 = 1$, in agreement with the elementary
theory of Part I (§ 18) or with the formula (163 b) of this section.
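
The conservation of the total number of copies can be illustrated on a small example. The sketch below assumes a Hermitian 2×2 matrix `R` with arbitrary illustrative entries (not taken from the text), sums the power series of $e^{-iR}$ directly, and compares it with the truncation $1 - iR - \tfrac{1}{2}R^2$ of (174 b); all function names are hypothetical helpers for this example.

```python
def mat_mul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def dagger(a):
    """Conjugate transpose."""
    n = len(a)
    return [[a[j][i].conjugate() for j in range(n)] for i in range(n)]


def identity(n):
    return [[1.0 + 0j if i == j else 0j for j in range(n)] for i in range(n)]


def exp_minus_i(r, terms=40):
    """Sum the power series of exp(-iR) for a small square matrix R."""
    n = len(r)
    result = identity(n)
    term = identity(n)
    for k in range(1, terms):
        # term_k = term_{k-1} * (-iR) / k
        term = [[-1j * x / k for x in row] for row in mat_mul(term, r)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result


def second_order(r):
    """Truncated series 1 - iR - R^2/2, as in the second-order expansion."""
    n = len(r)
    r2 = mat_mul(r, r)
    one = identity(n)
    return [[one[i][j] - 1j * r[i][j] - 0.5 * r2[i][j] for j in range(n)] for i in range(n)]


def deviation_from_unit(u):
    """Largest element of |U^dagger U - 1|."""
    n = len(u)
    p = mat_mul(dagger(u), u)
    return max(abs(p[i][j] - (1.0 if i == j else 0.0)) for i in range(n) for j in range(n))


# Hermitian example matrix (illustrative values only).
R = [[0.3, 0.1 + 0.2j], [0.1 - 0.2j, -0.4]]

if __name__ == "__main__":
    print(deviation_from_unit(exp_minus_i(R)))   # full series: unitary to rounding error
    print(deviation_from_unit(second_order(R)))  # truncation: small but finite deviation
```

For the truncated series the product $c^\dagger c$ works out to $1 + \tfrac{1}{4}R^4$, so the normalization is violated only in the fourth order of the small quantity $R$.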

It should be mentioned, in conclusion, that the case of a variable 
perturbation can be dealt with by a method similar to that of Born 
for the case of a constant perturbation in the theory of stationary states 
(§ 20). We can, in fact, determine the functions $\phi_{H'}$, which are the
particular solutions of the equation $(H+S+p_t)\phi = 0$ reducing to
$\phi_{H'} = \psi^0_{H'}$ at the initial instant $t = 0$, by putting

    $\phi_{H'} = \psi_{H'} + \Delta_1\psi_{H'} + \Delta_2\psi_{H'} + \dots$

and integrating successively the chain of equations

    $(H+p_t)\,\Delta_1\psi_{H'} = -S\psi_{H'}$,
    $(H+p_t)\,\Delta_2\psi_{H'} = -S\,\Delta_1\psi_{H'}$,

etc., subject to the condition that $\Delta_1\psi_{H'} = \Delta_2\psi_{H'} = \dots = 0$ for $t = 0$.
This method can be advantageously applied in the case of continuous 
spectra. It is, of course, completely equivalent to the method explained 
above, differing from it only by avoiding the use of the coefficients c. 


22. First Approximation; Theory of Simple Transitions 

The study of transitions produced by a perturbing force can conveniently
be divided into two parts, corresponding to the first and to
the second approximation of the general theory. The first-order terms 
determine the probability of simple (or direct) transitions between two 
states, which have been dealt with already to some extent in Part I, 
§18; while the second-order terms mainly determine the probability of 
combined transitions, involving intermediate states. 

So far as the action of variable forces is concerned, we shall restrict
ourselves to the case of a harmonically oscillating force represented by
the expression (171) with $f(t) = \cos(2\pi\nu t+\beta)$. In the general case of
a force represented by a sum (or integral) of terms of this form with
different frequencies $\nu$, $\Delta_1 c$ reduces to the sum (or integral) of parts
corresponding to the separate harmonic terms of $S$.

Putting $f(t) = \tfrac{1}{2}[e^{i(2\pi\nu t+\beta)} + e^{-i(2\pi\nu t+\beta)}]$, we get, according to (170),
(171 b), and (171 c),

    $\Delta_1 c_{H''H'} = -\tfrac{1}{2}\,T^0_{H''H'}\left[e^{i\beta}\,\frac{e^{i2\pi(H''-H'+h\nu)t/h}-1}{H''-H'+h\nu} + e^{-i\beta}\,\frac{e^{i2\pi(H''-H'-h\nu)t/h}-1}{H''-H'-h\nu}\right]$,    (175)

which can also be written in the form

    $\Delta_1 c_{H''H'} = -\frac{1}{2h}\,T^0_{H''H'}\left[e^{i\beta}\,\frac{e^{i2\pi(\nu_{H''H'}+\nu)t}-1}{\nu_{H''H'}+\nu} + e^{-i\beta}\,\frac{e^{i2\pi(\nu_{H''H'}-\nu)t}-1}{\nu_{H''H'}-\nu}\right]$,    (175 a)

involving the transition frequencies $\nu_{H''H'} = (H''-H')/h$ instead of the
energy values.

As pointed out in Part I, § 18, these expressions, regarded as functions
of the time, have two entirely different characters depending upon
whether the absolute value of the transition frequency $\nu_{H''H'}$ coincides
with $\nu$ (‘resonance’) or not.

In the latter case $\Delta_1 c_{H''H'}$ oscillates about the value zero, while $N_{H''}$,
as determined by (172 a) (for a state $H''$ different from $H'$), oscillates
about a small (positive) average value

    $\overline{N_{H''}} = \tfrac{1}{2}\,|T^0_{H''H'}|^2\left[\frac{1}{(H''-H'+h\nu)^2} + \frac{1}{(H''-H'-h\nu)^2}\right]$,    (176)

representing the average number of copies of the particle in the initially
vacant state $H''$.

In the case of resonance ($\nu = \pm\nu_{H''H'}$) one of the two terms in
square brackets becomes infinite, which means that a stationary distribution
is impossible, i.e. that the number of copies in the state $H''$
is steadily increasing. With the help of the formula

    $\lim_{\epsilon\to 0}\frac{e^{i2\pi\epsilon t}-1}{\epsilon} = 2\pi i t$,


we get in this case, according to (175), 


Aj i T Q H * H \e. ±i fcirit+ periodic term] 

Jitl 


(the positive sign referring to $H'' > H'$ and the negative sign to
$H'' < H'$), that is, dropping the periodic term which remains small
while $t$ increases:

    $N_{H''} = \frac{\pi^2}{h^2}\,|T^0_{H''H'}|^2\,t^2$.    (176 a)


A perturbing force is usually said to induce transitions from the state
$H'$ to $H''$ only when these transitions are manifested as a systematic
increase of $N_{H''}$ with the time, i.e. in the case of resonance. In the old
quantum theory the resonance or frequency condition was regarded as
the expression of the law of the conservation of energy on the assumption
that light of frequency $\nu$ can be absorbed or emitted in energy
quanta of the magnitude $h\nu$. We see that this relation is by no means
confined to light, being valid in the case of harmonic oscillations of any
kind. To the type of resonance implied there corresponds in our
pendulum model not ordinary resonance between the external force and
the free vibrations of a definite pendulum, but what in classical
mechanics is denoted by ‘parametric resonance’, which means the coincidence
of the frequency of the variation of the coupling $S^0_{H'H''}$
between two pendulums $H'$ and $H''$ with the difference of the frequencies
of their free vibrations (corresponding to the absence of the
coupling). It can, in fact, easily be shown that under this condition
even a very weak harmonic variation of the coupling coefficient
must produce a steady transfer of energy from the $H'$-pendulum (supposed
to be initially the only one set in motion) to the $H''$-pendulum,
while all the other pendulums $H'''$ for which the condition of parametric
resonance is not fulfilled will perform oscillations of small amplitude
without any tendency towards a steady increase.

The quadratic increase of $N_{H''}$ with the time according to (176 a)
corresponds to a transition probability (referred to unit time)

    $P_{H''H'} = \frac{dN_{H''}}{dt} = \frac{2\pi^2}{h^2}\,|T^0_{H''H'}|^2\,t$,

which is itself a linear function of the time.

This result is due to the exact coincidence between $\nu_{H''H'}$ and $\nu$ (sharp
resonance), which is practically never realized in nature. It has been
shown in Part I, § 18, that in the case of ‘nearly-monochromatic’ light,
formed by a spectral line of finite width, $N_{H''}$ becomes a linear function
of the time and the transition probability $P_{H''H'}$ becomes a constant.
The same is true, of course, of any nearly-harmonic perturbation.
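
The contrast between sharp resonance and the non-resonant case can be made concrete by evaluating the first-order amplitude numerically. The sketch below is a minimal illustration under stated assumptions: it takes the first-order amplitude in the frequency form $-\frac{T}{2h}\bigl[e^{i\beta}\frac{e^{i2\pi(\nu_{tr}+\nu)t}-1}{\nu_{tr}+\nu} + e^{-i\beta}\frac{e^{i2\pi(\nu_{tr}-\nu)t}-1}{\nu_{tr}-\nu}\bigr]$, uses units with $h = 1$ by default, and replaces a vanishing denominator by its limiting value $2\pi i t$. The function names and numerical values are hypothetical.

```python
import cmath
import math


def first_order_amplitude(T, nu_trans, nu, beta, t, h=1.0):
    """First-order amplitude for a perturbation T*cos(2*pi*nu*t + beta).

    T        : matrix element of the coordinate part (assumed constant)
    nu_trans : transition frequency (H'' - H')/h
    nu       : frequency of the perturbing force
    """
    def term(freq, phase):
        # (e^{i 2 pi freq t} - 1)/freq, with the limit 2 pi i t for freq -> 0
        if abs(freq) < 1e-12:
            return phase * 2j * math.pi * t
        return phase * (cmath.exp(2j * math.pi * freq * t) - 1.0) / freq

    bracket = (term(nu_trans + nu, cmath.exp(1j * beta))
               + term(nu_trans - nu, cmath.exp(-1j * beta)))
    return -(T / (2.0 * h)) * bracket


def population(T, nu_trans, nu, beta, t, h=1.0):
    """Number of copies in the final state to first order, |amplitude|^2."""
    return abs(first_order_amplitude(T, nu_trans, nu, beta, t, h)) ** 2


if __name__ == "__main__":
    for t in (1.0, 2.0, 3.0):
        # resonance (nu = nu_trans) versus a detuned perturbation
        print(t, population(0.01, 5.0, 5.0, 0.3, t), population(0.01, 5.0, 3.0, 0.3, t))
```

At resonance the population grows as $(\pi T t/h)^2$ apart from a small periodic term, while for the detuned frequency it merely oscillates within a fixed small bound.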

We shall return to this question in the second part of this section 
where it will be dealt with by a different method. 

The preceding formula cannot be directly applied to the special case
$\nu = 0$ corresponding to a perturbing force which does not depend upon
the time. We must take into account the fact that in the
case $\nu > 0$ only one term of (175) is effective in producing transitions
from the state $H'$ to the state with higher energy $H'' = H'+h\nu$, while
the other would be effective in producing transitions from $H'$ to the
lower level $H'' = H'-h\nu$ (if such a level exists). Now when $\nu = 0$ both
terms of (175) become equally effective for the transition $H' \to H''$
(more simply, the splitting of $S$ into two terms becomes meaningless).
We thus get

    $\Delta_1 c_{H''H'} = -S^0_{H''H'}\,\frac{e^{i2\pi(H''-H')t/h}-1}{H''-H'}$,    (177)

whence

    $\overline{N_{H''}} = \frac{2\,|S^0_{H''H'}|^2}{(H''-H')^2}$    (177 a)

if $H'' \ne H'$, and

    $N_{H''} = \frac{4\pi^2}{h^2}\,|S^0_{H''H'}|^2\,t^2$    (177 b)

if $H'' = H'$, which is the resonance condition in the present case. This
type of ‘inner’ resonance is faithfully reproduced in our pendulum
model by the resonance between the pendulums representing the unperturbed
states $H'$ and $H''$. It will be noticed that the expression (177 b)
differs from the corresponding expression (176 a) for the case $\nu > 0$ by
a factor 4 in the numerator.

The quantities $S^0_{H'H'}$, $S^0_{H''H''}$, etc., have the effect of slightly disturbing
the resonance between the corresponding pendulums, while the
quantities $S^0_{H''H'}$ describe the perturbing coupling forces. As long as
the latter are weak and there is no resonance, there corresponds to the
unperturbed vibration of each pendulum ($H'$) a perturbed normal vibration
of the whole system ($K'$) in which this particular pendulum plays
the principal role, while all the others only faintly accompany it. This
state of affairs is described by the formula $a_{H''K'} = \delta_{H''H'} + \Delta a_{H''K'}$ of
§ 19, where $a_{H''K'}$ are the transformation coefficients between the functions
$\chi_{K'}$ and $\psi_{H''}$; the small quantities $\Delta a_{H''K'}$ represent the participation
of the pendulums $H'' \ne H'$ in the normal vibration $K'$, corresponding
to the unperturbed oscillation of the pendulum $H'$ alone. We
might expect the quantities $N_{H''}$ (or their average values) to be equal
to the square of the moduli of these small quantities. As a matter of
fact, we have, according to (148 b),


    $\Delta_1 a_{H''K'} = -\frac{S^0_{H''H'}}{H''-H'}$

and consequently

    $|\Delta_1 a_{H''K'}|^2 = \frac{|S^0_{H''H'}|^2}{(H''-H')^2}$,

which is equal to one-half of the value of $\overline{N_{H''}}$ as determined by
(177).

This discrepancy is explained by the fact that the quantities
$|\Delta_1 a_{H''K'}|^2$ refer to the stationary states ($\chi_{K'}$) of the perturbed system,
while the quantities (177 a) refer to the non-stationary states $\phi_{H'}$, or
more exactly to the initial stages in the development of these states,
as follows from the method of approximation used in deriving equation
(177). The limitation to the initial stages is practically irrelevant so
long as the quantities $c_{H''H'}$ remain small, i.e. so long as there is no
resonance ($H'' \ne H'$). It becomes, however, of primary importance in
the case of resonance, the formula (177 b) being valid for small values
of $t$ only.
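
The factor one-half can be traced to the time average of the oscillating factor in the first-order amplitude. The lines below are a sketch only, assuming that for a constant perturbation the amplitude has the form $\Delta_1 c_{H''H'} = -S^0_{H''H'}\,(e^{i\omega t}-1)/(H''-H')$ with $H'' \ne H'$; the abbreviation $\omega = 2\pi(H''-H')/h$ is introduced here for brevity and is not a symbol of the text.

```latex
N_{H''} = |\Delta_1 c_{H''H'}|^2
        = \frac{|S^0_{H''H'}|^2}{(H''-H')^2}\,\bigl|e^{i\omega t}-1\bigr|^2
        = \frac{|S^0_{H''H'}|^2}{(H''-H')^2}\,(2-2\cos\omega t),
\qquad
\overline{N_{H''}} = \frac{2\,|S^0_{H''H'}|^2}{(H''-H')^2}
                   = 2\,|\Delta_1 a_{H''K'}|^2 .
```

The population therefore swings between zero and four times $|\Delta_1 a_{H''K'}|^2$, its mean being twice that quantity.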

The actual conditions met with in this case can be best understood 
with the help of the pendulum model. If initially only one pendulum, 
H' say, were set in motion, then, however small the perturbing forces 
which couple it with other pendulums, those which are in resonance 
with it will gradually acquire large vibration amplitudes (while the rest 
will but faintly accompany them as before). Resonance thus excludes 
the ‘dominance’ of one particular pendulum in the perturbed vibra¬ 
tions: all the pendulums which are in resonance with each other become 
equally important in the vibrations started by any one of them. 

In the simplest case of two coupled pendulums in resonance we obtain
the following well-known results: If originally (when $t = 0$) only one of
the two pendulums was vibrating, then its vibration energy must 
gradually go over to the second pendulum. If both pendulums are 
identical, this process goes on until the first pendulum comes to a standstill
and the second takes over its role. Similar beats, i.e. relatively
slow periodic increases and decreases of the vibrations of one pendulum
at the cost of the other, must take place with any relations between
their initial amplitudes and phases, except in two cases: ‘symmetrical’
vibrations with equal (real) amplitudes and phases, and ‘antisymmetrical’
vibrations with equal amplitudes and opposite phases. In these exceptional
cases the vibrations maintain a stationary character, i.e. their amplitudes
remain constant. The symmetrical and antisymmetrical vibrations
have somewhat different frequencies, both of which are, in general,
different from the common unperturbed vibration frequency of the 
pendulums. 

The non-stationary vibrations can be represented by a superposition 
of the two kinds of stationary vibrations. The frequency of the resulting 
‘beats’ must obviously be equal to the difference of the two funda¬ 
mental frequencies. 
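
The beats between two identical coupled pendulums have a direct counterpart in two degenerate states joined by a constant real matrix element. The sketch below is an illustration under assumptions not stated in the text: the coupling is called `s`, units with $h = 1$ are the default, and the state is built as the superposition of the symmetrical and antisymmetrical normal vibrations (energy shifts $+s$ and $-s$) that leaves only the first state occupied at $t = 0$.

```python
import cmath
import math


def two_state_amplitudes(s, t, h=1.0):
    """Amplitudes (c1, c2) of two degenerate states coupled by a real element s.

    They satisfy -(h / 2 pi i) dc1/dt = s*c2 and -(h / 2 pi i) dc2/dt = s*c1
    with c1(0) = 1, c2(0) = 0.
    """
    sym = cmath.exp(-2j * math.pi * s * t / h)   # symmetrical mode, shift +s
    anti = cmath.exp(2j * math.pi * s * t / h)   # antisymmetrical mode, shift -s
    c1 = 0.5 * (sym + anti)
    c2 = 0.5 * (sym - anti)
    return c1, c2


def populations(s, t, h=1.0):
    """Numbers of copies |c1|^2 and |c2|^2 at time t."""
    c1, c2 = two_state_amplitudes(s, t, h)
    return abs(c1) ** 2, abs(c2) ** 2


if __name__ == "__main__":
    s = 0.05
    for t in (0.0, 2.5, 5.0, 7.5, 10.0):
        print(t, populations(s, t))
```

The second population is $\sin^2(2\pi s t/h)$: it starts as $(2\pi s t/h)^2$, in line with the quadratic initial growth for a constant perturbation, and repeats with the beat frequency $2s/h$, the difference of the two normal frequencies.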

These results can easily be generalized to any finite number, r' say, 
of coupled pendulums in resonance. In the first approximation their 
coupling with other pendulums can be neglected. The resulting vibra¬ 
tions of the resonance group can be represented as a superposition of 
r' independent normal vibrations with different frequencies. By suitably 
adjusting the amplitudes (and phases) of these normal vibrations, a 
resulting vibration can be obtained such that, at the instant t = 0, one 
pendulum only— H' say—is in motion. The amplitudes of the others 
will then at the beginning increase linearly with the time and their
energies increase proportionally to $t^2$, this dependence being restricted
to such values of t as are small compared with the ‘beat periods’, that 
is, the reciprocals of the frequency-differences between the different 
normal modes of vibration. 

These results can easily be obtained from the general theory embodied
in equations (161 a) and (161 b) of § 21. It should be remarked that,
although equations (161 a) must be used for the approximate calculation
of the numbers $N_{H''}$ (for the coefficients $c_{H''}$ can be supposed to
be approximately constant while the coefficients $C_{H''}$ cannot), equations
(161 b), with the coefficients $K^0_{H'H''}$ which are independent of the time,
are more appropriate for the discussion of the case of resonance, because
of their similarity to the equations which determine the vibrations of
a system of coupled pendulums, the only modification consisting in
replacing $\frac{d^2}{dt^2}$ by $-\frac{h}{2\pi i}\frac{d}{dt}$.

If the coupling between the pendulums (i.e. $H$-states) not belonging
to the resonance (degenerate) set in question and those which belong to
this set is neglected, then the quantities $C_{H'}$ for the latter pendulums
can be determined by the system of $r'$ equations


h d n 

^idt IV 


1 


rr~H’ 


or in the notation corresponding to equation (154 b),

    $-\frac{h}{2\pi i}\frac{d}{dt}C_m = \sum_{n=1}^{r'} K^0_{mn}\,C_n$    ($m = 1, 2, \dots, r'$).    (178)

With the help of the relations

    $C_m = c_m\,e^{-i2\pi H't/h}$  and  $K^0_{mn} = \delta_{mn}H' + S^0_{mn}$

these equations can be reduced to the form

    $-\frac{h}{2\pi i}\frac{d}{dt}c_m = \sum_n S^0_{mn}\,c_n$.    (178 a)


The latter equations can be derived directly from the general equations 
(161a) in the same way as equations (178) have been derived from 
(161b), in conjunction with the condition $H_m = H_n = H'$ (i.e.
$S_{mn} = S^0_{mn}$), namely, by dropping terms connecting the states which
belong to the same energy $H'$ with those which belong to different
energy-levels. We have preferred, however, the indirect derivation in 
order to preserve throughout the analogy with the classical theory of 
the pendulum model. So far, however, as the results are concerned, the 
r' states of the same energy H' can be represented equally well by two 
systems of r' pendulums whose oscillations are determined either by 
equations (178) or (178 a). 

Taking equations (178 a), we can first of all obtain the normal
vibrations (i.e. the $K$-stationary states) by putting $c_n = a_n e^{-i2\pi\Delta H' t/h}$
[or $C_n = a_n e^{-i2\pi K' t/h}$ in the case of equations (178)], whereby it reduces
to the system of equations (154 b), which was obtained by another
method in § 19. After this, the general solution of (178 a) can be written
in the form

    $c_n = \sum_{s=1}^{r'} \gamma_s\,a_{ns}\,e^{-i2\pi(\Delta H')_s t/h}$,    (178 b)

where the $(\Delta H')_s$ are the solutions of (154 c) and the $a_{ns}$ are the corresponding
normalized solutions of (154 b), while the $\gamma_s$ denote arbitrary
constants. As already mentioned, these constants can be adjusted in
such a way as to make all the $c_n$ vanish at the initial instant $t = 0$
with the exception of one of them, $c_m$ say. This particular set of $\gamma_s$
can conveniently be denoted by $\gamma_{sm}$.

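We have, for their determination, the system of equations

    $\sum_s a_{ns}\,\gamma_{sm} = \delta_{nm}$,    (179)

which shows that the matrix $\gamma$ is identical with $a^{-1}$ or $a^\dagger$. We thus
get, writing $c_{nm}$ instead of $c_n$,

    $c_{nm} = \sum_s a_{ns}\,a^*_{ms}\,e^{-i2\pi(\Delta H')_s t/h}$,    (179 a)

or

    $C_{nm} = \sum_s a_{ns}\,a^*_{ms}\,e^{-i2\pi K'_s t/h}$.    (179 b)

Multiplying these expressions by their conjugate complex, we get

    $N_{nm} = \sum_s\sum_{s'} \rho_{ss'}\cos\frac{2\pi}{h}(K'_s-K'_{s'})t$,    (179 c)

where $\rho_{ss'}$ is the real part of the product $a_{ns}a^*_{ms}a^*_{ns'}a_{ms'}$.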

We thus see that $N_{nm}$ is represented as a function of the time as
a sum of constant terms ($s' = s$) and terms oscillating with the
‘difference’ or ‘beat’ frequencies $\nu_{ss'} = (K'_s-K'_{s'})/h = [(\Delta H')_s-(\Delta H')_{s'}]/h$.
So long as the product of the time $t$ with these frequencies (which are
the reciprocals of the ‘beat periods’) is small compared with 1, we can put

    $\cos 2\pi\nu_{ss'}t \approx 1 - 2\pi^2\nu_{ss'}^2 t^2$,

which gives, since $N_{nm}$ vanishes for $t = 0$ (unless $m = n$),

    $N_{nm} = -4\pi^2 t^2 \sum_{s<s'} \rho_{ss'}\,\nu_{ss'}^2$.

This expression coincides with (177 b) if

    $\sum_{s<s'} \rho_{ss'}\,\nu_{ss'}^2 = -\frac{|S^0_{nm}|^2}{h^2}$.

It can easily be shown, with the help of equations (154 b) and (154 c),
that this relation actually holds. We shall not, however, give the
proof of it here.

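It may be remarked that equation (179 a) reduces, subject to the same
condition or rather subject to the condition $(\Delta H')_s t/h \ll 1$ (for all $s$), to

    $c_{nm} = \delta_{nm} - \frac{i2\pi}{h}\,t\sum_s a_{ns}\,a^*_{ms}\,(\Delta H')_s$,

while equation (177) gives, in the case of resonance,

    $c_{nm} = -\frac{i2\pi}{h}\,S^0_{nm}\,t$,

from which, by the way, it follows that

    $S^0_{nm} = \sum_s a_{ns}\,a^*_{ms}\,(\Delta H')_s$.

This relation can be derived from the equations

    $\sum_n S^0_{mn}\,a_{ns} = (\Delta H')_s\,a_{ms}$

by multiplying them by $a^*_{ls}$ and summing over $s$. We thus get

    $\sum_s a_{ms}\,a^*_{ls}\,(\Delta H')_s = \sum_{n,s} S^0_{mn}\,a_{ns}\,a^*_{ls} = \sum_n S^0_{mn}\,\delta_{nl} = S^0_{ml}$.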
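
The relation just derived states that the perturbation matrix within the resonance group is recovered from its own characteristic values and normalized solutions. A small numerical check is sketched below for a real symmetric 2×2 matrix; the entries `a`, `b`, `d` and the helper names are hypothetical, the closed-form eigenvalues hold only for this 2×2 case, and the off-diagonal element `b` is assumed to be different from zero.

```python
import math


def eigen_2x2_symmetric(a, b, d):
    """Eigenvalues and normalized eigenvectors of [[a, b], [b, d]] with b != 0."""
    mean = 0.5 * (a + d)
    root = math.sqrt((0.5 * (a - d)) ** 2 + b ** 2)
    pairs = []
    for lam in (mean + root, mean - root):
        # (a - lam) * v0 + b * v1 = 0 is solved by v = (b, lam - a)
        v = (b, lam - a)
        norm = math.hypot(v[0], v[1])
        pairs.append((lam, (v[0] / norm, v[1] / norm)))
    return pairs


def rebuild(pairs):
    """Form sum_s lam_s * v_s[n] * v_s[m], i.e. the matrix from its eigen-data."""
    return [[sum(lam * v[n] * v[m] for lam, v in pairs) for m in range(2)]
            for n in range(2)]


if __name__ == "__main__":
    print(rebuild(eigen_2x2_symmetric(0.2, 0.5, -0.1)))
```

The printed matrix should reproduce the entries 0.2, 0.5, 0.5, -0.1 up to rounding, the real-matrix analogue of the sum over $s$ in the relation above.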

Further, it should be mentioned that an expansion of the same type as
that for the coefficients $c_{nm}$ is not possible for the coefficients $C_{nm}$,
as determined by (179 b), on account of the large value of the frequencies
$K'_s/h$. More exactly, the corresponding approximate expression for $C_{nm}$
would be valid for exceedingly short times only (small compared with
the reciprocal of $K'_s/h$), which hardly come into consideration.

The resonance between the $r'$ states we have just considered corresponds
to an absolute degeneracy between these states in the sense of
the perturbation theory not involving the time. In the present theory
we need not, however, distinguish between this case and that of a
‘relative degeneracy’ (§ 20), so long as the energy-differences ($H'-H''$)
between the states under consideration are small compared with the
corresponding matrix elements of the perturbation energy $S^0_{H'H''}$. If
the ratios $S^0_{H'H''}/(H'-H'')$ are large compared with 1 we can still use
the expression (177 b) for the probability of the transition $H' \to H''$
provided the time $t$ is small compared with the reciprocal of the ‘beat
frequency’ $(H''-H')/h$. In the contrary case we must limit ourselves
to the expression (177 a) for the average value of the probability of
finding the system in the new state $H''$.

We have, hitherto, confined ourselves exclusively to the case of a
discrete $H$-spectrum. The modifications of the general theory which
are necessary in order to allow for the presence of a continuous or mixed 
spectrum in a limited or unlimited range have already been indicated 
in the preceding section. They necessitate, however, an important 
revision of the approximate theory for the case of resonance between 
states belonging to a discrete set, on the one hand, and states belonging 
to a continuous set on the other (and also between states belonging to 
two different continuous sets). The essence of this revision consists in 
the replacement of the idea of sharp resonance , referring to two exactly 
determined states, by that of unsharp resonance for a narrow range or 
‘band’ of final states belonging to a continuous set. 
Let us consider transitions which are produced by a perturbing force
vibrating harmonically with the frequency $\nu$. The initial state will be
supposed to belong to a discrete set and to have the energy $H'$. If the
energy $H'+h\nu$ lies in the region of the continuous spectrum (as can
happen in the case of a hydrogen-like atom if $H' < 0$ while $H'+h\nu > 0$),
then transitions will be produced not only to the state with the energy
$H''_\nu = H'+h\nu$, but also to the neighbouring states whose energy $H''$ is
slightly different from $H''_\nu$. This follows from two considerations.
Firstly, the resonance condition H" = H'+hv need not be exactly 
satisfied even when the final state belongs to a discrete set. Secondly, 
the neighbouring states of a continuous set are themselves approxi¬ 
mately in resonance with each other and cannot therefore be considered 
separately. We must consider instead a ‘band’ of neighbouring states 
or, in other words, a ‘wave group’ formed by the superposition of the 
harmonic waves representing them. 

According to the general theory, we obtain for the coefficients $c_{H''}$ of
the functions $\psi_{H''}$ belonging to a continuous set exactly the same
differential equations as for the coefficients of the functions belonging
to a discrete set. If the particle is supposed to be initially in the
(discrete) state H', then we have in both cases the same expression for
$c_{H''} = c_{H''H'}$, namely, (175). Limiting ourselves to states in the
neighbourhood of the resonance state with the energy $H'' = H''_\nu = H'+h\nu$,
we can drop the first term in (175) on account of its relative smallness,
so that

$$\Delta_1 c_{H''H'} = -\tfrac{1}{2}\,T^0_{H''H'}\,e^{-i\beta}\,\frac{e^{i2\pi(H''-H'-h\nu)t/h}-1}{H''-H'-h\nu}. \qquad (180)$$


If the functions $\psi_{H''}$ are duly normalized, the number of copies of
the particle that have passed during the time t from the state H' into
a range ΔH" about the resonance value $H''_\nu$ is given by the expression

$$N_{\Delta H''_\nu} = \int_{\Delta H''} |\Delta_1 c_{H''H'}|^2\,dH''. \qquad (180a)$$

Before carrying out the integration over H" we must notice that this
integration actually refers to the energy alone if the other two parameters
specifying the wave functions $\psi_{H''}$ remain discrete (as, for
example, in the case of the hydrogen-like atom). If one or both of
these parameters are continuously variable, dH" must be replaced by
the product of dH" with the element or elements of these continuously
variable parameters.* Leaving this case aside, we can calculate (180a) by
integrating over the energy alone.

Since the last factor in (180) has, for not too small values of t, a very
sharp maximum at the resonance point $H'' = H''_\nu$ and comparatively
very small values outside the immediate vicinity of this point, we can
replace the first factor by its value for $H'' = H''_\nu$ and extend the
integration over the difference $H''-H''_\nu$ from −∞ to +∞.

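Putting, for brevity, $2\pi(H''-H'-h\nu)t/h = \xi$, we then get

$$N_{\Delta H''_\nu} = \frac{|T^0_{H''_\nu H'}|^2}{4}\,\frac{2\pi t}{h}\int_{-\infty}^{+\infty}\frac{|e^{i\xi}-1|^2}{\xi^2}\,d\xi.$$

Since $|e^{i\xi}-1|^2 = 2(1-\cos\xi) = 4\sin^2\tfrac{1}{2}\xi$ and

$$\int_{-\infty}^{+\infty}\frac{\sin^2\tfrac{1}{2}\xi}{\xi^2}\,d\xi = \frac{\pi}{2},$$

this gives

$$N_{\Delta H''_\nu} = \frac{\pi^2}{h}\,|T^0_{H''_\nu H'}|^2\,t. \qquad (181)$$

The probability of a transition from the state H' into the band $\Delta H''_\nu$
per unit time is thus equal to

$$\Gamma_{H''_\nu H'} = \frac{\pi^2}{h}\,|T^0_{H''_\nu H'}|^2. \qquad (181a)$$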
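The value π/2 of the integral over ξ, on which the passage from (180a) to (181) rests, can be checked numerically. The following Python sketch is only an illustration: the function names, the cut-off and the step are our own choices and do not come from the text, and since the neglected tails contribute roughly 1/cutoff, the agreement with π/2 is approximate.

```python
import math

def resonance_integral(cutoff=2000.0, step=0.01):
    """Midpoint-rule estimate of the integral of sin^2(xi/2)/xi^2 over [-cutoff, cutoff]."""
    n = int(round(2.0 * cutoff / step))
    total = 0.0
    for k in range(n):
        xi = -cutoff + (k + 0.5) * step
        if xi == 0.0:
            total += 0.25  # limiting value of the integrand at xi = 0
        else:
            total += math.sin(0.5 * xi) ** 2 / (xi * xi)
    return total * step

def rate_per_unit_time(T0_abs, h):
    """Rate (181a), pi^2 |T0|^2 / h, for a given modulus of the matrix element."""
    return math.pi ** 2 * T0_abs ** 2 / h

if __name__ == "__main__":
    # The two printed numbers should agree to about three decimal places.
    print(resonance_integral(), math.pi / 2)
```

Enlarging the cut-off brings the estimate closer to π/2, which reflects the fact that the integrand is concentrated near the resonance point ξ = 0.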


The same result could be obtained with the help of the quasi-discrete
functions

$$\bar\psi_{H''} = \lim_{\Delta H''\to 0}\frac{1}{\sqrt{\Delta H''}}\int_{\Delta H''}\psi_{H''}\,dH''.$$

We must first consider the intervals ΔH" as finite and calculate the
coefficients $\bar c_{H''H'} = \Delta_1\bar c_{H''H'}$ according to formula (180) with the matrix
elements $T^0_{H''H'}$ replaced by

$$\bar T^0_{H''H'} = \int\bar\psi^*_{H''}\,T\,\psi_{H'}\,dV \cong \frac{1}{\sqrt{\Delta H''}}\int dV\int_{\Delta H''}\psi^*_{H''}\,T\,\psi_{H'}\,dH'' = \sqrt{\Delta H''}\int\psi^*_{H''}\,T\,\psi_{H'}\,dV = \sqrt{\Delta H''}\;T^0_{H''H'}.$$

This formula is the more accurate the smaller the interval ΔH". We
can therefore use it in the calculation of the limiting value of the sum
$\sum|\Delta_1\bar c_{H''H'}|^2 = \sum|\Delta_1 c_{H''H'}|^2\,\Delta H''$ extended over a large number of
infinitely small intervals containing the resonance value $H''_\nu$. This limiting
value is obviously nothing else but the integral (180a).

An important example of transitions of the mixed type just considered
is the ionization of an atom by the action of light, i.e. the
photoelectric effect. In this case we can put

$$S = -eE_0\,x\cos(2\pi\nu t+\beta),$$

where $E_0$ is the amplitude of the electric vector of the light waves,
supposed to be parallel to the x-axis, and e is the charge of the electron.
This gives

$$\Gamma_{H''_\nu H'} = \frac{\pi^2e^2E_0^2}{h}\,|x_{H''_\nu H'}|^2. \qquad (181b)$$


Let us now turn to the case ν = 0 corresponding to a perturbation
which does not depend upon the time. The transition being again from
a discrete state H' to a continuous range of states H" belonging to
approximately the same value of the energy, we can determine its
probability per unit time by the formula (181a), putting T = S and
introducing the factor 4, for the same reason as in the formula (177 b)
[in contradistinction from (176 a)]. We thus get

$$\Gamma_{H''H'} = \frac{4\pi^2}{h}\,|S^0_{H''H'}|^2 \quad (H''=H'). \qquad (182)$$

Another (purely formal) modification which must be introduced for
the case ν = 0 refers to the notation. If the continuous spectrum
overlaps the discrete spectrum (which is necessary for the resonance
condition H" = H' to be satisfied), we must introduce explicitly one or
two parameters in order to distinguish the different states (continuous
and discrete) which have the same energy. Denoting this parameter
by Q, we can rewrite (182) in the form

$$\Gamma_{Q''Q'} = \frac{4\pi^2}{h}\,|S^0_{H'Q'',H'Q'}|^2. \qquad (182a)$$

If, finally, the parameter Q" is continuously variable and if a range of
the continuous spectrum is specified by the product

$$\sigma(H'',Q'')\,dH''\,dQ'',$$

where σ is a certain function of H" and Q" such that the probability of
finding the particle in the above range is equal to

$$|c_{H''Q''}|^2\,\sigma(H'',Q'')\,dH''\,dQ'',$$

then the probability of a resonance transition from the sharply defined
state H'Q' into a band corresponding to the interval dQ" is given by

$$\Gamma_{dQ'',Q'} = \frac{4\pi^2}{h}\,|S^0_{H'Q'',H'Q'}|^2\,\sigma(H',Q'')\,dQ''. \qquad (182b)$$

The same modification applies to a resonance transition produced by
a harmonically vibrating perturbation. Instead of (181a) we then get

$$\Gamma_{dQ'',Q'} = \frac{\pi^2}{h}\,|T^0_{H''_\nu Q'',H'Q'}|^2\,\sigma(H''_\nu,Q'')\,dQ''. \qquad (182c)$$

It can easily be shown that these formulae remain valid when both 
the final and the initial states belong to a continuous set. We come upon 
this case in collision problems of the simplest type such as the deflexion 
of a particle by some field of force practically limited to a finite region 
of space, the initial and final states (‘before' and ‘after’ the collision 
with the source of the perturbing field) being described by wave func¬ 
tions corresponding to the motion in the absence of this field. 

If, however, the final state belongs to a discrete set, then the initial
state must be specified unsharply, i.e. by a certain range of H' (and
eventually also of Q').

In conclusion the following circumstance must be pointed out. From 
the corpuscular point of view resonance means the conservation of energy. 
The fact that perturbing forces practically produce only those transi¬ 
tions which satisfy the resonance condition can be regarded from this 
point of view as the natural consequence of the law of conservation of 
energy. As we have seen, however, the resonance condition is not 
strictly obeyed in wave mechanics. First of all, transitions of a non- 
systematic character are produced from the initial state to states with 
an entirely different energy, the average probability of finding the 
particle in these ‘stray’ states being given by the formula (183 a). 
Further, in the case of a continuous spectrum, the systematic transi¬ 
tions are governed by the condition of unsharp resonance, implying 
slight deviations from the law of conservation of energy. It thus seems 
that the latter does not strictly hold in wave mechanics. 

This conclusion is, however, wrong, for the simple reason that H does
not represent the actual energy of the particle, this energy, if the
perturbation S does not depend upon the time, being specified by the
characteristic values of the operator K = H + S. The resonance equation
H" = H' is therefore merely an approximate expression of the law
of conservation of energy which in reality should be expressed by
K" = K'.

As a matter of fact, if the motion of the particle is described from
the point of view of K, i.e. by means of the characteristic functions of
this operator, then a set of stationary states is obtained between which
no transitions are possible, irrespective of whether K" = K' or K" ≠ K'.
It is only when the motion of the particle is described from the point
of view of H that transitions appear, produced by the neglected part
S of the total energy K. It is precisely this ‘misuse’ of the energy S
which is the cause of the apparent violation of the law of conservation
of energy. From the point of view of H, S is not a constant (unless it
commutes with H, which, in general, is not so) and therefore has no
definite value. It can therefore be regarded as the ‘goat’ responsible
for the deviations from the conservation law H" = H' in the transitions
for which this equation is not satisfied.

A similar consideration applies even more strongly to the general
case in which S does depend upon the time, for in this case the values
of the total energy K remain undetermined.

23. Second Approximation; Theory of Combined Transitions 

The preceding considerations pave the way to an understanding of 
transitions the probability of which vanishes when derived from the 
equations of the first approximation but does not vanish when estimated 
with the help of the second approximation. 

According to equations (173) and (173 a), we have this case if the
matrix component $S_{H''H'}$ vanishes, while there is one or several states
$H'''$ such that the components $S_{H''H'''}$ and $S_{H'''H'}$ are both different from
zero.

For the sake of simplicity we shall first consider the case of discrete
states together with a perturbation independent of the time. If there
is no resonance between the initial and final states, i.e. if H" ≠ H', then
the probability amplitude, $c_{H''H'} = \Delta_2 c_{H''H'}$, of finding the particle in
the state H" will remain a small quantity of the second order, and the
square of its modulus $N_{H''}$ will oscillate about an average value of
the fourth order of smallness. If, however, H" = H', $c_{H''H'}$ will increase
linearly and $N_{H''}$ will increase quadratically with the time, which means
that there are systematic transitions from the initial state H' to the
final H" via one or several intermediate states $H'''$. For these intermediate
states the resonance condition with the end states need not
(and in general cannot) be satisfied; the fact, however, that in the
combined transitions $H' \to H''' \to H''$ the particle has to pass through
a state with an energy $H'''$ different from the initial (and final) value
does not in the least prevent it from making such transitions. The
apparent violation of the energy law for each of the two ‘legs’ of the
jump from H' to H" can obviously be straightened out by taking into
account the perturbation energy S not only as the cause of the transition
but also as an invisible factor in the energy balance. If, for instance,
$H''' > H'$, then we can imagine that the energy $H'''-H'$, which is
required for the first step of the transition, is ‘borrowed’ from the
perturbation energy S and restored to it during the second step.
The probability amplitude $c_{H'''H'} = \Delta_1 c_{H'''H'}$ of the state $H'''$ will
remain small, the corresponding probability (or number of copies in
the state $H'''$) $N_{H'''} = |c_{H'''H'}|^2$ oscillating about the constant value
$2|S_{H'''H'}|^2/(H'''-H')^2$, while the number $N_{H''}$, though initially much
smaller, increases with the time, and may finally become very large.
We can visualize this process by imagining each state as a vessel which
may be filled with a liquid representing the probability or the number
of copies. This liquid is initially concentrated in the vessel H' and is
pumped by the perturbation to the vessel H" with which it is connected
indirectly through a set of vessels $H'''$; the liquid does not, however,
accumulate in the latter, just passing through them and accumulating
in H". A still better picture of this transition process is provided by
our pendulum model, the probability or number of copies being represented
by the energy flowing from the pendulum H' to the pendulum
H" which is coupled with it through the pendulums $H'''$. The lack of
resonance between the latter and H' results in these pendulums performing
steady oscillations of small amplitude and functioning simply
as carriers of energy from H' to H".

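After these preliminary considerations of a qualitative character, we
can pass to the quantitative theory of the double transitions. Putting
in (173a)

$$S_{H''H'''} = S^0_{H''H'''}\,e^{i2\pi(H''-H''')t/h} \quad\text{and}\quad S_{H'''H'} = S^0_{H'''H'}\,e^{i2\pi(H'''-H')t/h},$$

we get

$$\Delta_2 c_{H''H'} = \sum_{H'''}S^0_{H''H'''}S^0_{H'''H'}\,f_{H''H'''H'}(t), \qquad (183)$$

where

$$f_{H''H'''H'}(t) = -\frac{4\pi^2}{h^2}\int_0^t dt'\int_0^{t'}dt''\;e^{i2\pi[(H''-H''')t'+(H'''-H')t'']/h},$$

that is,

$$f_{H''H'''H'}(t) = \frac{2\pi i}{h}\int_0^t dt'\,e^{i2\pi(H''-H''')t'/h}\,\frac{e^{i2\pi(H'''-H')t'/h}-1}{H'''-H'} = \frac{1}{H'''-H'}\left\{\frac{e^{i2\pi(H''-H')t/h}-1}{H''-H'}-\frac{e^{i2\pi(H''-H''')t/h}-1}{H''-H'''}\right\}. \qquad (183a)$$

In the case of resonance H" = H' this expression reduces to

$$f_{H'H'''H'}(t) = \frac{i2\pi t}{h(H'''-H')}+\frac{e^{i2\pi(H'-H''')t/h}-1}{(H'''-H')^2}. \qquad (183b)$$

Dropping the second term on account of its smallness, we thus get

$$\Delta_2 c_{H''H'} = \frac{i2\pi t}{h}\sum_{H'''}\frac{S^0_{H''H'''}S^0_{H'''H'}}{H'''-H'}. \qquad (183c)$$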
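The linear growth (183c) of the amplitude of the final state, and the bounded oscillation of the intermediate state about $2|S_{H'''H'}|^2/(H'''-H')^2$, can be illustrated on a toy system of three states. The Python sketch below is not taken from the text: the function name `evolve`, the numerical values, the units (h/2π = 1) and the Runge-Kutta integrator are all assumptions made for the sake of illustration.

```python
def evolve(S=0.01, gap=1.0, t_end=200.0, dt=0.01):
    """Integrate i dc/dt = H c for three states and return
    (population of the final state at t_end, time average of the
    population of the intermediate state).

    Assumed toy model, units h/(2*pi) = 1.
    State order: 0 = H' (initial), 1 = H''' (intermediate), 2 = H'' (final).
    H'' = H' = 0, H''' = gap; S couples 0-1 and 1-2, never 0-2 directly.
    """
    H = [[0.0, S, 0.0],
         [S, gap, S],
         [0.0, S, 0.0]]

    def deriv(c):
        return [-1j * sum(H[i][j] * c[j] for j in range(3)) for i in range(3)]

    c = [1 + 0j, 0j, 0j]
    steps = int(round(t_end / dt))
    mid_sum = 0.0
    for _ in range(steps):
        # one classical fourth-order Runge-Kutta step
        k1 = deriv(c)
        k2 = deriv([c[i] + 0.5 * dt * k1[i] for i in range(3)])
        k3 = deriv([c[i] + 0.5 * dt * k2[i] for i in range(3)])
        k4 = deriv([c[i] + dt * k3[i] for i in range(3)])
        c = [c[i] + dt * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i]) / 6
             for i in range(3)]
        mid_sum += abs(c[1]) ** 2
    return abs(c[2]) ** 2, mid_sum / steps

if __name__ == "__main__":
    n_final, n_mid = evolve()
    # Compare with (t*S^2/gap)^2 from (183c) and with 2*S^2/gap^2.
    print(n_final, (200.0 * 0.01 ** 2 / 1.0) ** 2)
    print(n_mid, 2 * 0.01 ** 2 / 1.0 ** 2)
```

In these units (183c) reads $|\Delta_2 c|^2 = (tS^2/\text{gap})^2$; the agreement is only approximate, because the oscillating second term of (183b) has been dropped.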


We did not replace $S^0_{H''H'''}$ by $S^0_{H'H'''}$, in spite of the fact that H" = H',
in order to indicate somehow that the final state is different from the
initial one. This can be done in a clearer way by introducing the
additional suffix Q and writing $S^0_{H'Q'',H'''Q'''}$ instead of $S^0_{H''H'''}$.

In the case of double transitions, just as in the case of simple transitions,
one usually has to do with an unsharp resonance between the
initial state and a band of continuously variable final states. If the
energy is the only continuously variable parameter, the probability of
transition from H'Q' to H'Q" in the time t is expressed by the integral

$$\int|\Delta_2 c_{H''H'}|^2\,dH''$$

extended over the neighbourhood of the resonance value H" = H'. In
carrying out the integration we can drop the second term in the expression
(183 a). With this condition we must obviously get the same result
as for the simple resonance transition H' → H", with the matrix element
$S^0_{H''H'}$ replaced by the expression

$$\sum_{H'''}\frac{S^0_{H''H'''}S^0_{H'''H'}}{H'''-H'}.$$

We thus obtain for the probability per unit time of the transition
H' → H" the following formula [cf. eq. (182a)]:

$$\Gamma_{H''H'} = \frac{4\pi^2}{h}\left|\sum_{H'''}\frac{S^0_{H''H'''}S^0_{H'''H'}}{H'''-H'}\right|^2 \quad (H''=H'). \qquad (184)$$
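Formula (184) lends itself to direct evaluation once the matrix elements and the energies of the intermediate states are given. The following Python sketch merely transcribes it for a finite list of discrete intermediate states; the function name, the argument layout and the numerical values in the usage lines are hypothetical.

```python
import math

def second_order_rate(S_final_mid, S_mid_init, E_mid, E_init, h):
    """Rate (184): (4 pi^2 / h) |sum over H''' of S0_{H''H'''} S0_{H'''H'} / (H''' - H')|^2.

    S_final_mid[k] and S_mid_init[k] are the (possibly complex) matrix
    elements for the k-th intermediate state, E_mid[k] is its energy and
    E_init is the common energy H' = H'' of the end states.
    """
    if not (len(S_final_mid) == len(S_mid_init) == len(E_mid)):
        raise ValueError("one entry per intermediate state is required")
    effective = sum(a * b / (e - E_init)
                    for a, b, e in zip(S_final_mid, S_mid_init, E_mid))
    return 4.0 * math.pi ** 2 * abs(effective) ** 2 / h

if __name__ == "__main__":
    # A single intermediate state.
    print(second_order_rate([0.2], [0.5], [2.0], 0.0, 1.0))
    # Two intermediate states placed symmetrically above and below H'
    # cancel each other, since the denominators have opposite signs.
    print(second_order_rate([0.1, 0.1], [0.1, 0.1], [1.0, -1.0], 0.0, 1.0))
```

The second usage line shows a feature of (184) that is easy to overlook: the contributions of the intermediate states are added as amplitudes before the modulus is squared, so that they may interfere destructively.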


This formula is not complete in two respects. Firstly, it does not take
into account other parameters (Q) in addition to the energy. Secondly,
it neglects intermediate states belonging to the continuous energy
spectrum. If the parameter Q is discretely variable, we get, instead of
(184), the expression

$$\Gamma_{Q''Q'} = \frac{4\pi^2}{h}\left|\sum_{H'''}\sum_{Q'''}\frac{S^0_{H'Q'',H'''Q'''}S^0_{H'''Q''',H'Q'}}{H'''-H'}+\sum_{Q'''}\int\frac{S^0_{H'Q'',H'''Q'''}S^0_{H'''Q''',H'Q'}}{H'''-H'}\,dH'''\right|^2. \qquad (184a)$$


If Q is itself continuously variable, then the summation over $Q'''$ must
be replaced by an integration, the element $dQ'''$ being multiplied by the
factor $\sigma(H''',Q''')$, and Q" being replaced by the element dQ" with the
factor σ(H', Q") on the right side of (184 a).

If there is a slight direct coupling between the states H' and H",
then the transition probability is determined by the sum of $\Delta_1 c_{H''H'}$
and $\Delta_2 c_{H''H'}$, so that instead of (184) we get

$$\Gamma_{H''H'} = \frac{4\pi^2}{h}\left|S^0_{H''H'}-\sum_{H'''}\frac{S^0_{H''H'''}S^0_{H'''H'}}{H'''-H'}\right|^2. \qquad (184b)$$


It often happens that the perturbation is due to the simultaneous
action of two different forces which are incoherent with regard to each
other, in the sense that they involve independent phase-factors, over
which one must average, with the result that all quantities containing
odd powers of these factors vanish.

We thus get

$$S = F+G, \qquad (185)$$

and

$$\overline{|S^0_{H''H'}|^2} = |F^0_{H''H'}|^2+|G^0_{H''H'}|^2, \qquad (185a)$$

the average value of the product of $F^0_{H''H'}$ with $G^{0*}_{H''H'}$ being equal to zero.

If we consider simple transitions H' → H" produced by the simultaneous
action of two such perturbations, we get for the transition
probability the sum of the two probabilities, corresponding to the action
of each of the two perturbations taken separately.
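The vanishing of the cross term on averaging over independent phases can be verified directly. In the Python sketch below (the function name and the grid size are illustrative assumptions), the average over the two phase-factors is replaced by an average over a uniform grid of phases, for which the cross term cancels exactly apart from rounding errors.

```python
import cmath
import math

def phase_averaged_square(F, G, n=64):
    """Average |F e^{ia} + G e^{ib}|^2 over an n x n grid of independent phases a, b."""
    total = 0.0
    for j in range(n):
        a = 2.0 * math.pi * j / n
        for k in range(n):
            b = 2.0 * math.pi * k / n
            total += abs(F * cmath.exp(1j * a) + G * cmath.exp(1j * b)) ** 2
    return total / (n * n)

if __name__ == "__main__":
    F, G = 0.3 + 0.4j, 1.2 - 0.5j
    # The average should equal |F|^2 + |G|^2.
    print(phase_averaged_square(F, G), abs(F) ** 2 + abs(G) ** 2)
```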

However, in the case of combined transitions, we get, according to
(184), the following expression for the transition probability

$$\Gamma_{H''H'} = (F,F)_{H''H'}+(F,G)_{H''H'}+(G,G)_{H''H'}, \qquad (186)$$

the first and last terms being obtained from (184) by replacing S by
F or G. They represent the ‘solo’ action of the two perturbing forces,
while the middle term represents their combined action, one of the
perturbing forces producing the first and the other the second step of
the transition. This combination term

$$(F,G)_{H''H'} = \frac{4\pi^2}{h}\left|\sum_{H'''}\frac{F^0_{H''H'''}G^0_{H'''H'}+G^0_{H''H'''}F^0_{H'''H'}}{H'''-H'}\right|^2 \quad (H''=H') \qquad (186a)$$

turns out to be, in many cases, more important than the two ‘pure’
terms.

These considerations acquire a particular importance in the generaliza¬ 
tion of the preceding results for the case of a perturbation depending 
upon the time. 

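Let us first assume that S reduces to a simple harmonic vibration
without a constant term. We then have, as before,

$$S = T(x,y,z)\cos(2\pi\nu t+\beta).$$

Substituting this in equation (173 a), we get the former expression
(183) for $\Delta_2 c_{H''H'}$, with $S^0$ replaced by $T^0$ and with

$$f_{H''H'''H'}(t) = -\frac{\pi^2}{h^2}\int_0^t dt'\left\{e^{i[2\pi(\nu_{H''H'''}+\nu)t'+\beta]}+e^{i[2\pi(\nu_{H''H'''}-\nu)t'-\beta]}\right\}\int_0^{t'}dt''\left\{e^{i[2\pi(\nu_{H'''H'}+\nu)t''+\beta]}+e^{i[2\pi(\nu_{H'''H'}-\nu)t''-\beta]}\right\},$$

where $\nu_{H''H'} = (H''-H')/h$, i.e.

$$\begin{aligned}
f_{H''H'''H'}(t) = \frac{1}{4h^2}\Bigg\{\; & e^{i2\beta}\left[\frac{e^{i2\pi(\nu_{H''H'}+2\nu)t}-1}{(\nu_{H''H'}+2\nu)(\nu_{H'''H'}+\nu)}-\frac{e^{i2\pi(\nu_{H''H'''}+\nu)t}-1}{(\nu_{H''H'''}+\nu)(\nu_{H'''H'}+\nu)}\right] \\
& +\frac{e^{i2\pi\nu_{H''H'}t}-1}{\nu_{H''H'}(\nu_{H'''H'}-\nu)}-\frac{e^{i2\pi(\nu_{H''H'''}+\nu)t}-1}{(\nu_{H''H'''}+\nu)(\nu_{H'''H'}-\nu)} \\
& +\frac{e^{i2\pi\nu_{H''H'}t}-1}{\nu_{H''H'}(\nu_{H'''H'}+\nu)}-\frac{e^{i2\pi(\nu_{H''H'''}-\nu)t}-1}{(\nu_{H''H'''}-\nu)(\nu_{H'''H'}+\nu)} \\
& +e^{-i2\beta}\left[\frac{e^{i2\pi(\nu_{H''H'}-2\nu)t}-1}{(\nu_{H''H'}-2\nu)(\nu_{H'''H'}-\nu)}-\frac{e^{i2\pi(\nu_{H''H'''}-\nu)t}-1}{(\nu_{H''H'''}-\nu)(\nu_{H'''H'}-\nu)}\right]\Bigg\}.
\end{aligned}$$

This expression clearly shows that the resonance condition $\nu_{H''H'} = \pm\nu$
(i.e. H"−H' = ±hν) of the theory of simple transitions has to be
replaced in the case of double transitions by the condition

$$\nu_{H''H'} = \pm2\nu \ \text{or}\ 0,$$

that is, H"−H' = ±2hν or 0, giving respectively

$$f_{H''H'''H'}(t) = \frac{i\pi t}{2h}\,\frac{e^{\mp i2\beta}}{H'''-H'\mp h\nu} \quad\text{or}\quad \frac{i\pi t}{2h}\left(\frac{1}{H'''-H'+h\nu}+\frac{1}{H'''-H'-h\nu}\right).$$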


These results can easily be interpreted by assuming that each step of
the double transition $H' \to H''' \to H''$ consists either in the absorption or
in the (forced) emission of one quantum hν of light, if, for the sake of
concreteness, the perturbation S is regarded as due to monochromatic
light of frequency ν.

This interpretation is supported by the fact that the transition
probability as determined by the square of $\Delta_2 c_{H''H'}$ turns out to be
proportional to the square of the intensity of the light (i.e. to the fourth
power of the electric force $E_0$, to which S must be proportional in the
case under consideration). This is just what would be expected if the
probability of each of the two steps of the transition is proportional
to the intensity of the light.
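The proportionality to the fourth power of the amplitude can be illustrated numerically. The Python sketch below integrates the equations of motion for a toy system of three states under a perturbation T cos(ωt) satisfying the double resonance condition H"−H' = 2hν, and compares two amplitudes differing by a factor 2. Everything specific in it (the function name, the level energies, the units h/2π = 1, the Runge-Kutta integrator) is an assumption of ours, chosen so that no single step is in resonance.

```python
import math

def two_quantum_population(T, t_end=200.0, dt=0.01, omega=1.0,
                           E_mid=1.5, E_final=2.0):
    """Population of the final state after t_end, starting from state 0.

    Assumed toy model, units h/(2*pi) = 1.
    State order: 0 = H' (energy 0), 1 = H''' (E_mid), 2 = H'' (E_final).
    The perturbation T*cos(omega*t) couples 0-1 and 1-2 only, and
    E_final = 2*omega, so that only the double transition is resonant.
    """
    energies = (0.0, E_mid, E_final)

    def deriv(t, c):
        v = T * math.cos(omega * t)
        return [
            -1j * (energies[0] * c[0] + v * c[1]),
            -1j * (energies[1] * c[1] + v * (c[0] + c[2])),
            -1j * (energies[2] * c[2] + v * c[1]),
        ]

    c = [1 + 0j, 0j, 0j]
    steps = int(round(t_end / dt))
    for n in range(steps):
        t = n * dt
        # one classical fourth-order Runge-Kutta step
        k1 = deriv(t, c)
        k2 = deriv(t + 0.5 * dt, [c[i] + 0.5 * dt * k1[i] for i in range(3)])
        k3 = deriv(t + 0.5 * dt, [c[i] + 0.5 * dt * k2[i] for i in range(3)])
        k4 = deriv(t + dt, [c[i] + dt * k3[i] for i in range(3)])
        c = [c[i] + dt * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i]) / 6
             for i in range(3)]
    return abs(c[2]) ** 2

if __name__ == "__main__":
    weak = two_quantum_population(0.01)
    strong = two_quantum_population(0.02)
    # Doubling the amplitude should multiply the population by about 2**4.
    print(weak, strong, strong / weak)
```

With these values the resonant term of the second approximation gives $|\Delta_2 c|^2 \approx (tT^2/2)^2$, so that the weaker amplitude should yield a population of about $10^{-4}$.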

It must be emphasized, however, that for each of these two steps
the usual resonance condition $\nu_{H'''H'} = \pm\nu$ is, in general, not satisfied.
We have here the same situation as in the case ν = 0 discussed above:
an apparent violation of the energy principle, straightened out by the
perturbation energy whose value is actually indeterminate.

It is, in principle, quite possible for light to induce transitions whose 
probability is proportional not to the first but to the second or even 
to a higher power of its intensity. In order that such effects could be 
observed, however, the intensity of the light must be extremely high, 
in fact much higher than that with which we usually have to do in our 
laboratory experiments. For, according to these experiments, the 
transition probability, as measured by the rate of photo-ionization for 
example, turns out to be exactly proportional to the light intensity. 

We are thus entitled to conclude that double transitions produced
by the action of light alone practically do not occur—on the surface of 
the earth at least. 

There is, however, a great variety of phenomena which can be 
described as double transitions under the combined action of light and 
some other perturbation which does not depend upon the time. 

Such combined perturbations are represented by a function of the form

$$S = T(x,y,z)\cos(2\pi\nu t+\beta)+G(x,y,z). \qquad (187)$$

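If, in the calculation of $\Delta_2 c_{H''H'}$, only those terms are preserved which
are bilinear in T and G, i.e. proportional to their product, then, instead
of (183), we get

$$\Delta_2 c_{H''H'} = \sum_{H'''}\left\{T^0_{H''H'''}G^0_{H'''H'}\,f_{H''H'''H'}(t)+G^0_{H''H'''}T^0_{H'''H'}\,g_{H''H'''H'}(t)\right\} \qquad (187a)$$

with

$$f_{H''H'''H'}(t) = -\frac{2\pi^2}{h^2}\int_0^t dt'\left\{e^{i[2\pi(\nu_{H''H'''}+\nu)t'+\beta]}+e^{i[2\pi(\nu_{H''H'''}-\nu)t'-\beta]}\right\}\int_0^{t'}dt''\,e^{i2\pi\nu_{H'''H'}t''}$$

and

$$g_{H''H'''H'}(t) = -\frac{2\pi^2}{h^2}\int_0^t dt'\,e^{i2\pi\nu_{H''H'''}t'}\int_0^{t'}dt''\left\{e^{i[2\pi(\nu_{H'''H'}+\nu)t''+\beta]}+e^{i[2\pi(\nu_{H'''H'}-\nu)t''-\beta]}\right\},$$

that is,

$$\begin{aligned}
f_{H''H'''H'}(t) = \frac{1}{2h^2}\Bigg\{\; & e^{i\beta}\left[\frac{e^{i2\pi(\nu_{H''H'}+\nu)t}-1}{(\nu_{H''H'}+\nu)\,\nu_{H'''H'}}-\frac{e^{i2\pi(\nu_{H''H'''}+\nu)t}-1}{(\nu_{H''H'''}+\nu)\,\nu_{H'''H'}}\right] \\
& +e^{-i\beta}\left[\frac{e^{i2\pi(\nu_{H''H'}-\nu)t}-1}{(\nu_{H''H'}-\nu)\,\nu_{H'''H'}}-\frac{e^{i2\pi(\nu_{H''H'''}-\nu)t}-1}{(\nu_{H''H'''}-\nu)\,\nu_{H'''H'}}\right]\Bigg\}
\end{aligned}$$

and

$$\begin{aligned}
g_{H''H'''H'}(t) = \frac{1}{2h^2}\Bigg\{\; & e^{i\beta}\left[\frac{e^{i2\pi(\nu_{H''H'}+\nu)t}-1}{(\nu_{H''H'}+\nu)(\nu_{H'''H'}+\nu)}-\frac{e^{i2\pi\nu_{H''H'''}t}-1}{\nu_{H''H'''}(\nu_{H'''H'}+\nu)}\right] \\
& +e^{-i\beta}\left[\frac{e^{i2\pi(\nu_{H''H'}-\nu)t}-1}{(\nu_{H''H'}-\nu)(\nu_{H'''H'}-\nu)}-\frac{e^{i2\pi\nu_{H''H'''}t}-1}{\nu_{H''H'''}(\nu_{H'''H'}-\nu)}\right]\Bigg\}.
\end{aligned}$$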

The two expressions define the resonance condition in the same way 
as for a simple transition produced by the action of the light alone. 

In the case of an unsharp resonance in the neighbourhood of the
value H" = H'±hν, these expressions practically reduce to

$$f_{H''H'''H'}(t) = \frac{1}{2h^2}\,e^{\mp i\beta}\,\frac{e^{i2\pi(\nu_{H''H'}\mp\nu)t}-1}{(\nu_{H''H'}\mp\nu)\,\nu_{H'''H'}},\qquad g_{H''H'''H'}(t) = \frac{1}{2h^2}\,e^{\mp i\beta}\,\frac{e^{i2\pi(\nu_{H''H'}\mp\nu)t}-1}{(\nu_{H''H'}\mp\nu)(\nu_{H'''H'}\mp\nu)}, \qquad (187b)$$

so that we get

$$\Delta_2 c_{H''H'} = \frac{1}{2}\,e^{\mp i\beta}\sum_{H'''}\left(\frac{T^0_{H''H'''}G^0_{H'''H'}}{H'''-H'}+\frac{G^0_{H''H'''}T^0_{H'''H'}}{H'''-H'\mp h\nu}\right)\frac{e^{i2\pi(H''-H'\mp h\nu)t/h}-1}{H''-H'\mp h\nu},$$

and consequently

$$\Gamma_{H''H'} = \frac{\pi^2}{h}\left|\sum_{H'''}\left(\frac{T^0_{H''H'''}G^0_{H'''H'}}{H'''-H'}+\frac{G^0_{H''H'''}T^0_{H'''H'}}{H'''-H'\mp h\nu}\right)\right|^2 \qquad (187c)$$

instead of (184). This formula should be completed to allow for transitions
through states belonging to the continuous H-spectrum, and also
for other parameters (Q) besides the energy, in the same way as (184).

It must be mentioned that those terms (quadratic in T or G)
which have been dropped in formula (187a) have no importance so
long as we restrict ourselves to resonance transitions of the above type.
As shown above, they would become predominant only for transitions
of the type H" = H'±2hν or H" = H'.

An interesting feature of the expression (187 c) is the non-symmetrical 
character of the two terms in the brackets with regard to the frequency 
v. The latter affects the second term only, which corresponds to the 
action of light in the first step of the transition, while in the first term, 
which corresponds to the action of light in the second step, the
frequency ν appears only through the subscript H".

As an example of the application of the formula (187 c) we could cite
the problem of the transformation of light into heat in gaseous bodies.
In this case G must represent the perturbing force experienced by the
atom under consideration due to other atoms with which it is supposed
to come into collision. The complete treatment of this problem
requires, however, the generalization of the preceding theory to allow
for the motion of all the particles which act on each other (see Part III).

Another example of double transitions of the above kind is provided
by the phenomenon of the scattering of light which can be considered
as a combination of two elementary acts (simple transitions), namely,
the absorption of a light quantum hν and the spontaneous emission of
another light quantum hν' corresponding, in general, to a different
frequency. The two acts may take place in either order, since the law
of the conservation of energy need not be satisfied in the intermediate
state (if the perturbation energy is left out of account).

The application of formula (187c) to the case of the scattering of 
light necessitates, however, two important amendments both in the 
underlying principles and in the form of the result. 

First of all it is necessary to visualize a ‘spontaneous’ transition,
associated with light emission, as caused by some perturbation G, the
reaction of the electron’s radiation field on itself, for example (see
Part I, § 18). This question has, however, no practical significance,
since in formula (187 c) we have to do not with the perturbation energy
G itself (which cannot be specified in the usual way, i.e. as a function
of the coordinates or as an operator G(x)) but with its matrix elements
only. The latter, however, can be regarded as known, since they
define the emission probability for which the expression (93), § 17,
Part I, can be used. Identifying this expression with the expression
$4\pi^2|G^0_{H''H'}|^2\sigma(H'')/h$, we can determine the matrix elements of G
provided the function σ(H") is known.

We shall not investigate this question here, for it will be considered
in detail later in connexion with a more direct theory of light-scattering.
It must be mentioned, however, that this theory leads to a formula for
$\Gamma_{H''H'}$ which differs from (187 c) in two respects.

Firstly, the resonance condition $H''_\nu = H'\pm h\nu$ is replaced by

$$H''_\nu = H'+h\nu-h\nu', \qquad (188)$$

where ν is the frequency of the absorbed and ν' the frequency of the
emitted (‘scattered’) light. This result can be considered as the direct
consequence of the energy principle.
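Condition (188) fixes the frequency of the scattered light once the incident frequency and the two energy levels are given. A minimal Python sketch of this bookkeeping follows; the function name and the numerical values are hypothetical, and the refusal of non-positive frequencies is our own safeguard, not something stated in the text.

```python
def scattered_frequency(nu, H_initial, H_final, h):
    """Frequency nu' of the scattered light from (188): h*nu' = H' + h*nu - H''."""
    nu_prime = nu - (H_final - H_initial) / h
    if nu_prime <= 0:
        raise ValueError("the incident quantum is too small for this transition")
    return nu_prime

if __name__ == "__main__":
    # The atom is left in a higher state: the scattered frequency is lowered.
    print(scattered_frequency(5.0, 0.0, 2.0, 1.0))
    # The atom returns to the initial level: the frequency is unchanged.
    print(scattered_frequency(5.0, 0.0, 0.0, 1.0))
```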

Secondly, taking the sign − in the denominator of the second term
in (187 c) (which corresponds to absorption of light), we must replace
the denominator of the first term, i.e. the difference $H'''-H'$, by
$H'''-H'+h\nu'$ (which corresponds to the emission of light of frequency
ν' in the first step of the double transition). We thus get for the
probability of scattering, instead of (187 c), the expression

$$\Gamma_{H''H'} = \frac{\pi^2}{h}\left|\sum_{H'''}\left(\frac{T^0_{H''H'''}G^0_{H'''H'}}{H'''-H'+h\nu'}+\frac{G^0_{H''H'''}T^0_{H'''H'}}{H'''-H'-h\nu}\right)\right|^2. \qquad (188a)$$

If the incident light is polarized in the direction of the unit vector q
and that part of the scattered radiation is considered which corresponds
to vibrations of the electron in the direction q', then we must put

$$T = -e(\mathbf{r}\cdot\mathbf{q})E_0 \quad\text{and}\quad G = -e(\mathbf{r}\cdot\mathbf{q}')E'_0, \qquad (188b)$$

where r is the radius vector of the electron (with respect to the nucleus
of the atom) and $E'_0$ is a certain ‘effective amplitude’. G is thus obtained
from T by replacing the amplitude of the external electric force by
a certain constant, which will be determined later.

These results can be derived from the general perturbation theory by
replacing the spontaneous emission forming one of the two steps of the
scattering process by an induced emission, i.e. an emission due to the action
of a secondary light wave with the frequency ν' and the amplitude $E'_0$.

Assuming the electron to be exposed simultaneously to the action of
these two light waves, we have for the total perturbation energy an
expression of the form

$$S = T(x,y,z)\cos(2\pi\nu t+\beta)+T'(x,y,z)\cos(2\pi\nu't+\beta'). \qquad (189)$$

This gives for the bilinear part of $\Delta_2 c_{H''H'}$ the previous expression
(187 a) with G = T' but with somewhat different values for the factors
f and g.

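Limiting ourselves to the case of an approximate resonance in the
neighbourhood of the value (188) and dropping relatively small terms, we get

$$f_{H''H'''H'}(t) = \frac{1}{4h^2}\,e^{i(\beta'-\beta)}\,\frac{e^{i2\pi(\nu_{H''H'}-\nu+\nu')t}-1}{(\nu_{H''H'}-\nu+\nu')(\nu_{H'''H'}+\nu')},\qquad g_{H''H'''H'}(t) = \frac{1}{4h^2}\,e^{i(\beta'-\beta)}\,\frac{e^{i2\pi(\nu_{H''H'}-\nu+\nu')t}-1}{(\nu_{H''H'}-\nu+\nu')(\nu_{H'''H'}-\nu)}, \qquad (189a)$$

which gives

$$\Delta_2 c_{H''H'} = \frac{1}{4h^2}\,e^{i(\beta'-\beta)}\sum_{H'''}\left(\frac{T^0_{H''H'''}T'^0_{H'''H'}}{\nu_{H'''H'}+\nu'}+\frac{T'^0_{H''H'''}T^0_{H'''H'}}{\nu_{H'''H'}-\nu}\right)\frac{e^{i2\pi(\nu_{H''H'}-\nu+\nu')t}-1}{\nu_{H''H'}-\nu+\nu'} \qquad (189b)$$

and consequently

$$\Gamma_{H''H'} = \frac{\pi^2}{4h}\left|\sum_{H'''}\left(\frac{T^0_{H''H'''}T'^0_{H'''H'}}{H'''-H'+h\nu'}+\frac{T'^0_{H''H'''}T^0_{H'''H'}}{H'''-H'-h\nu}\right)\right|^2, \qquad (189c)$$

i.e. formula (188 a) with G replaced by ½T'. All that remains is
to assume a fixed effective value for $E'_0$ in order to obtain the probability
of scattering.

This value can be determined in the following way: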

The unsharpness of the resonance implied in the preceding calculations
can be realized either by a transition of the particle into a ‘band’
ΔH" of a continuous spectrum, with exactly specified values both of
ν and ν', or by a transition into a perfectly definite state H" belonging to a
discrete set, the unsharpness of the resonance being due in this case to a
variation of ν' in a small interval Δν' about the value (H'+hν−H")/h
or, in other words, to the emission of a spectral line ν' of finite width.

From the latter point of view, which we shall adopt for the present,
we must consider instead of S' = T' cos(2πν't+β'), a superposition of
a set of harmonic vibrations with different frequencies contained in the
small interval Δν' and with completely independent phase constants,
i.e. incoherent with regard to each other.

This means that $|T'^0_{H''H'}|^2$ must be proportional not to the square
of the sum of the amplitudes of the component vibrations, but to the
sum of the squares of these elementary amplitudes. Denoting the value
of this sum for all the frequencies contained within the interval dν' by
$E'^2_{\nu'}\,d\nu'$, we get

$$|T'^0_{H''H'}|^2 = e^2\,|(\mathbf{r}\cdot\mathbf{q}')_{H''H'}|^2\,E'^2_{\nu'}\,d\nu',$$

or if (as has been done above) the integration is extended over the
values of the energy and not over the frequency,

$$|T'^0_{H''H'}|^2 = \frac{e^2}{h}\,|(\mathbf{r}\cdot\mathbf{q}')_{H''H'}|^2\,E'^2_{\nu'}\,dH''. \qquad (190)$$

The corresponding transition probability is equal to

$$\Gamma_{H''H'} = \frac{\pi^2e^2}{h^2}\,|(\mathbf{r}\cdot\mathbf{q}')_{H''H'}|^2\,E'^2_{\nu'}.$$

This quantity must obviously be identified with the probability of 
spontaneous emission (see Part I, eq. (93), § 17) 




whence it follows that 


E ? = *~r h *' 3 - 


(190a) 


Putting further $T = -e(\mathbf{r}\cdot\mathbf{q})E_0$

(q being the direction of the vector $E_0$), we get, according to (189 c),




?e* 2 


( r g*fl"' •q)( r g" , H'' Q ) i ( T irrr" ' Q) 


H”-H'+hv' 


•q'Krg-H-q) l 

-H'-hv J 


(190 b)

The intensity of the scattered radiation is equal to the product of
$\Gamma_{H''H'}$ and hν'.

If ν' is different from ν, and if a direct transition from the state H'
to the (discrete) state H" is impossible—as assumed hitherto—formula 
(190b) describes, in conjunction with the resonance condition (188), 
the so-called Raman effect or incoherent scattering of light. If the 
state H" belongs to a continuous set, corresponding to an ionized state 
of the atom, we get the Compton effect instead of the Raman effect. 
In this case it is necessary, however, to modify formula (190 b), firstly 
by allowing for transitions through intermediate states belonging to the 
continuous spectrum, and secondly by allowing for the finite speed of 
light both in absorption and emission. These corrections will be intro¬ 
duced later in Part III where an exact theory of the Compton effect 
will be given. 

24. Theory of Transitions for an Undefined Initial State 

The coefficients $c_{H''}$ (or in particular $c_{H''H'}$) are complex quantities,
whose modulus determines the probability of the corresponding states
(or the number of copies associated with the latter), while their phases
have no direct physical significance.

We shall see later that these phases can be used for the building up
of a theory, in which the copies of the particle appear as a number of
particles of the same sort (cf. Part I, § 20). So long, however, as we
confine ourselves to one particle only, the phases of the quantities $c_{H''}$
are devoid of all meaning and must therefore not appear in the final
equations. This means that the latter must contain only the moduli
or the squares of the moduli of the coefficients $c_{H''}$.

We shall apply this principle to the problem (first treated by Dirac)
of the change in the distribution of the copies of a particle among
different states due to a perturbation of any kind when the state of the
particle at the initial instant was not exactly specified, so that only
the initial values of the probabilities $N^0_{H'}$ were known. Our problem
will consist in the determination of these probabilities $N_{H'}(t)$ as functions
of the time (for sufficiently small values of the latter).

In this form the problem is indeterminate, for the equations of the 
perturbation theory involve not the probabilities N jr> but the proba¬ 
bility amplitudes c ir , whose values, both with respect to modulus and 
phase, are determined by the values of their moduli Vi\ T J r and phases 
at the initial moment. In order to get rid of these phases, which 
are completely irrelevant so far as the probabilities are concerned, we 
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can average the results over them —assuming all the values of these 
phases to he equally probable. 

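Taking the case of a discrete set of states, we have, according to (161),

$$\frac{h}{2\pi i}\frac{dc_{H'}}{dt} + \sum_{H''} S_{H'H''}\,c_{H''} = 0.$$

To these equations we shall add the conjugate complex equations

$$-\frac{h}{2\pi i}\frac{dc^*_{H'}}{dt} + \sum_{H''} S_{H''H'}\,c^*_{H''} = 0.$$

Multiplying the former by $c^*_{H'}$ and the latter by $c_{H'}$ and subtracting one from the other, we get

$$\frac{h}{2\pi i}\frac{d}{dt}\left(c^*_{H'}c_{H'}\right) = -\sum_{H''}\left(S_{H'H''}\,c^*_{H'}c_{H''} - S_{H''H'}\,c^*_{H''}c_{H'}\right),$$

i.e.

$$\frac{d}{dt}N_{H'} = -\frac{2\pi i}{h}\sum_{H''}\left(S_{H'H''}\,c^*_{H'}c_{H''} - S_{H''H'}\,c^*_{H''}c_{H'}\right). \tag{191}$$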

We see that the right side of these equations cannot be expressed as a function of the numbers $N_{H'}$.

One might be tempted to put

$$c_{H'} = \sqrt{N_{H'}}\,e^{i\gamma_{H'}}$$

and average over the phases $\gamma_{H'}$ (and $\gamma_{H''}$), considering all their values as equally probable. This would, however, reduce the right side of (191) to zero. In fact, we are not allowed to assume the equal probability of all the values of the phases $\gamma_{H'}$ at any time; if they were equally probable at the instant $t = 0$ they will no longer be so later on.

We shall therefore, in the right side of (191), substitute for the probability amplitudes $c_{H'}$ approximate expressions in terms of their initial values, up to the first approximation, so as to obtain the second-order approximation for the time derivatives of the numbers $N_{H'}$ (it should be remembered that the matrix components of $S$ by which the coefficients $c$ are multiplied are regarded as small quantities of the first order).

We thus get

$$c_{H'} = c^0_{H'} + \Delta_1 c_{H'}.$$

Now we obviously have

$$\Delta_1 c_{H'} = \sum_{H'''} \Delta_1 c_{H'H'''}\,c^0_{H'''}, \tag{191 a}$$

so that

$$c^*_{H'}c_{H''} \cong c^{0*}_{H'}c^0_{H''} + \sum_{H'''}\left(\Delta_1 c^*_{H'H'''}\,c^{0*}_{H'''}c^0_{H''} + \Delta_1 c_{H''H'''}\,c^0_{H'''}c^{0*}_{H'}\right).$$

If now we put

$$c^0_{H'} = \sqrt{N^0_{H'}}\,e^{i\gamma^0_{H'}} \tag{191 b}$$

and average over the values of the initial phases $\gamma^0_{H'}$, $\gamma^0_{H''}$, etc., regarding them as independent of each other and equally probable, we get (for $H'' \neq H'$)

$$\overline{c^*_{H'}c_{H''}} = \Delta_1 c^*_{H'H''}\,N^0_{H''} + \Delta_1 c_{H''H'}\,N^0_{H'},$$

or since $\Delta_1 c^*_{H'H''} = -\Delta_1 c_{H''H'}$,

$$\overline{c^*_{H'}c_{H''}} = \Delta_1 c_{H''H'}\left(N^0_{H'} - N^0_{H''}\right). \tag{192}$$
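The averaging over independent, equally probable initial phases can be illustrated numerically. The following is only a minimal sketch: the uniform phase grid and the function names are assumptions made for illustration, not part of the text. It shows that a product of two amplitudes belonging to different states averages to zero, while a product belonging to the same state keeps its full weight, which is why only the numbers $N^0$ survive the averaging.

```python
import cmath
import math


def mean_cross_term(n_grid=64):
    """Average exp(i*(g_b - g_a)) over two independent, uniformly spaced phases."""
    total = 0j
    for a in range(n_grid):
        for b in range(n_grid):
            gamma_a = 2.0 * math.pi * a / n_grid
            gamma_b = 2.0 * math.pi * b / n_grid
            total += cmath.exp(1j * (gamma_b - gamma_a))
    return total / n_grid**2


def mean_diagonal_term(n_grid=64):
    """Average exp(i*(g_a - g_a)) over one phase; the phase cancels identically."""
    total = 0j
    for a in range(n_grid):
        gamma_a = 2.0 * math.pi * a / n_grid
        total += cmath.exp(1j * (gamma_a - gamma_a))
    return total / n_grid


print(abs(mean_cross_term()))     # vanishes up to rounding error
print(abs(mean_diagonal_term()))  # equal to one
```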

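Substituting this in (191) and remembering that

$$\Delta_1 c_{H'H''} = -\frac{2\pi i}{h}\int_0^t S_{H'H''}\,dt,$$

we get

$$\frac{dN_{H'}}{dt} = \frac{4\pi^2}{h^2}\sum_{H''}\left[S_{H'H''}(t)\int_0^t S_{H''H'}(t')\,dt' + S_{H''H'}(t)\int_0^t S_{H'H''}(t')\,dt'\right]\left(N^0_{H''} - N^0_{H'}\right),$$

that is,

$$\frac{dN_{H'}}{dt} = \sum_{H''}\Gamma_{H'H''}\left(N^0_{H''} - N^0_{H'}\right), \tag{192 a}$$

with

$$\Gamma_{H'H''} = \frac{d}{dt}\,\frac{4\pi^2}{h^2}\left|\int_0^t S_{H'H''}(t')\,dt'\right|^2, \tag{192 b}$$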

which is obviously nothing else but the probability (per unit time) of a direct transition from the state $H'$ into $H''$ or vice versa. Equation (192 a) could be obtained directly from the symmetry relation $\Gamma_{H'H''} = \Gamma_{H''H'}$. It is easy to obtain higher approximations for $dN_{H'}/dt$, taking account of combined transitions. This would not affect the form of equations (192 a). Instead of (192 b) we should, however, obtain the following expression for the transition probability:

$$\Gamma_{H'H''} = \frac{d}{dt}\,\frac{4\pi^2}{h^2}\left|\int_0^t S_{H'H''}(t')\,dt' + \frac{2\pi}{ih}\sum_{H'''}\int_0^t dt'\,S_{H'H'''}(t')\int_0^{t'} dt''\,S_{H'''H''}(t'')\right|^2. \tag{192 c}$$
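The structure of the balance equation $dN_{H'}/dt = \sum_{H''}\Gamma_{H'H''}(N_{H''} - N_{H'})$ can be explored with a short numerical sketch. Two things here are assumptions that go beyond the text: the transition probabilities are taken as constant numbers (the matrix `gamma` below is purely hypothetical), and the equation is applied repeatedly over successive short intervals, whereas the derivation above holds only for sufficiently small times. Under these assumptions the symmetry of $\Gamma$ guarantees that the total probability is conserved and that the populations tend towards equality.

```python
def step_populations(N, gamma, dt):
    """One explicit Euler step of dN_a/dt = sum_b gamma[a][b] * (N[b] - N[a])."""
    n = len(N)
    return [
        N[a] + dt * sum(gamma[a][b] * (N[b] - N[a]) for b in range(n) if b != a)
        for a in range(n)
    ]


# Hypothetical symmetric transition probabilities per unit time (arbitrary units).
gamma = [
    [0.0, 0.3, 0.1],
    [0.3, 0.0, 0.2],
    [0.1, 0.2, 0.0],
]

# All copies start in the first state.
N = [1.0, 0.0, 0.0]
for _ in range(2000):
    N = step_populations(N, gamma, 0.01)

print(sum(N))  # total probability stays equal to one up to rounding
print(N)       # populations have nearly equalized
```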




VI 

RELATIVISTIC REMODELLING AND MAGNETIC 
GENERALIZATION OF THE WAVE MECHANICS 
OF A SINGLE ELECTRON 

25. Simplest Form of Relativistic Wave Mechanics 

All the developments of the preceding chapters were based on Schrödinger's wave equation for a single particle moving in an external field of force with a given potential-energy function $U(x,y,z,t)$.

This equation, as we have seen in Chap. I, corresponds to the pre-relativistic classical mechanics, which neglects the variation of the mass of a particle with its velocity. In addition it does not take into account
magnetic forces, which depend not only upon the position of a particle 
but also upon its velocity (being in fact proportional to the latter). 

Our next problem will be to find the improved form of the funda¬ 
mental equation of wave mechanics for a single particle—which we 
shall think of as an electron—that will take account both of the 
variability of mass and of the magnetic forces. 

It turns out that the two parts of this problem can be solved simultaneously, at one stroke as it were, if in reforming the Schrödinger equation we let ourselves be guided by the basic principle of the
relativity theory, namely, the equivalence of the space coordinates and 
the time (multiplied by ic), which must be expressed by the symmetry 
of all the fundamental equations of physics with respect to both, and 
which entails the four-dimensional character of all physical quantities. 

It should be mentioned that the same principle can be applied to the 
problem of improving the equations of the classical pre-relativistic 
theory and finding their relativistically correct pre-quantum form. 

The formal correspondence between the energy-momentum relation of Newtonian mechanics

$$\frac{1}{2m}\left(g_x^2+g_y^2+g_z^2\right)+U-W = 0 \tag{193}$$

and the Schrödinger equation written in the form

$$\left[\frac{1}{2m}\left(p_x^2+p_y^2+p_z^2\right)+U+p_t\right]\psi = 0, \tag{193 a}$$

with

$$p_x = \frac{h}{2\pi i}\frac{\partial}{\partial x},\quad p_y = \frac{h}{2\pi i}\frac{\partial}{\partial y},\quad p_z = \frac{h}{2\pi i}\frac{\partial}{\partial z},\quad p_t = \frac{h}{2\pi i}\frac{\partial}{\partial t}, \tag{193 b}$$

leads us straight back to that four-dimensional representation of physical quantities, which is the formal content of the relativity theory. We
must, therefore, so modify our original equations that they assume a 
symmetrical form with respect to the components of four-dimensional 
vectors appearing therein. 

If, as will be done in future, the time is specified in the usual way, i.e. by the real quantity $t$ without the imaginary factor $ic$, this symmetry will be slightly distorted by the appearance of the factor $-c^2$ or $-1/c^2$ in the product of the fourth components of any two vectors.

To begin with, we must fill up an important gap in the usual definition of the momentum-energy vector

$$g_x = mv_x,\quad g_y = mv_y,\quad g_z = mv_z,\quad -g_t = W, \tag{193 c}$$

a gap which makes this definition inconsistent from the point of view of the relativity theory and which limits its correspondence with the operator-vector (193 b).

In Einstein's mechanics of a particle with rest mass $m_0$ we have, corresponding to the components of the momentum, i.e.

$$mv_x = \frac{m_0v_x}{\sqrt{1-v^2/c^2}},\quad mv_y = \frac{m_0v_y}{\sqrt{1-v^2/c^2}},\quad mv_z = \frac{m_0v_z}{\sqrt{1-v^2/c^2}},$$

as fourth component of the four-vector concerned, the 'proper energy'

$$mc^2 = \frac{m_0c^2}{\sqrt{1-v^2/c^2}} \qquad \left(\text{or } mic = \frac{m_0\,ic}{\sqrt{1-v^2/c^2}}\right).$$

Now the quantity $p_t$ in (193 b) represents, not this proper energy, but the total energy $E = mc^2+U$ diminished by the constant rest-energy $m_0c^2$. For the relativistic formulation of the laws of corpuscular mechanics we must clearly add this constant to the energy $W$, i.e. we must put

$$-g_t = E = W+m_0c^2 = mc^2+U.$$


In addition to this, we must regard the potential energy $U$ as the fourth component, i.e. as the 'time-projection', of a certain four-vector and also take into account its space projection. This space projection $\mathbf{G}$, which obviously corresponds to the momentum and which, just as $U$, can be an arbitrary function of the coordinates and the time, will be called the potential momentum. In the special case $\mathbf{G} = 0$, so far exclusively considered, the components of the force acting on the particle reduce to the usual expressions $-\partial U/\partial x$, $-\partial U/\partial y$, $-\partial U/\partial z$. The question as to the nature and the mathematical expression of the force due to the vector function $\mathbf{G}$ will be considered later on. We are at present only interested in the fact that, by the introduction of the 'potential momentum', the quantities $g_x$, $g_y$, $g_z$ appearing in formulae (193 c) must be defined as the components of the total momentum $m\mathbf{v}+\mathbf{G}$ just as the quantity $-g_t$ denotes the total energy $mc^2+U$.

We obtain, therefore, instead of (193 c), the formulae

$$g_x = mv_x+G_x,\quad g_y = mv_y+G_y,\quad g_z = mv_z+G_z,\quad -g_t = mc^2+U. \tag{194}$$

The components of the 'proper energy momentum vector' are related to one another, according to definition, by the relation

$$(mv_x)^2+(mv_y)^2+(mv_z)^2-\frac{1}{c^2}(mc^2)^2 = -m_0^2c^2 \tag{194 a}$$

(which is equivalent to the formula $m = m_0/\sqrt{1-v^2/c^2}$). In the case $\mathbf{G} = 0$ this relation can be written in the form

$$(mv_x)^2+(mv_y)^2+(mv_z)^2-\frac{1}{c^2}(E-U)^2 = -m_0^2c^2.$$

In the limiting case of small velocities ($v/c \ll 1$) we can put approximately

$$(m\mathbf{v})^2 \cong (m_0\mathbf{v})^2$$

and

$$\frac{1}{c^2}(E-U)^2 = \frac{1}{c^2}(m_0c^2+W-U)^2 \cong m_0^2c^2+2m_0(W-U).$$

Thus the previous equation reduces to

$$(m_0v_x)^2+(m_0v_y)^2+(m_0v_z)^2+2m_0(U-W) = 0,$$

which is the classical energy-momentum equation (193). It should be 
noticed that it expresses the ‘law of the conservation of energy’ when 
W (or E) is constant, which can only be the case when the function 
U is independent of the time (static field). 
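The passage from the relativistic relation to the Newtonian one can be checked with a few numbers. The sketch below is illustrative only: the units (with $c = 1$ and $m_0 = 1$), the sample velocities and the function names are assumptions. It verifies that the relation between the proper momentum and the proper energy holds identically when $m = m_0/\sqrt{1-v^2/c^2}$, and that the exact kinetic energy $(m-m_0)c^2$ approaches $\tfrac{1}{2}m_0v^2$ only when $v/c$ is small.

```python
import math


def relativistic_lhs(m0, v, c):
    """(mv)^2 - (mc^2)^2 / c^2 + m0^2 c^2 with m = m0 / sqrt(1 - v^2/c^2)."""
    m = m0 / math.sqrt(1.0 - (v / c) ** 2)
    return (m * v) ** 2 - (m * c**2) ** 2 / c**2 + (m0 * c) ** 2


def kinetic_energy_ratio(v, c):
    """Ratio of the exact kinetic energy (m - m0) c^2 to the Newtonian m0 v^2 / 2."""
    m_over_m0 = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return (m_over_m0 - 1.0) * c**2 / (0.5 * v**2)


print(relativistic_lhs(1.0, 0.6, 1.0))   # vanishes up to rounding error
print(kinetic_energy_ratio(0.01, 1.0))   # very close to one for a slow particle
print(kinetic_energy_ratio(0.6, 1.0))    # noticeably larger than one for a fast particle
```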

We see therefore that the equation

$$(g_x-G_x)^2+(g_y-G_y)^2+(g_z-G_z)^2-\frac{1}{c^2}(g_t+U)^2+m_0^2c^2 = 0, \tag{194 b}$$

which results from (194) and (194 a), represents the relativistic generalization and refinement of the Newtonian relation (193).

From this equation we can go over to the corresponding fundamental 
equation of the relativistic wave mechanics in the same way as in the 
non-relativistic case—namely, by replacing the vector g in (194b) by 
the corresponding operator-vector p and equating to zero the result 
obtained by the application of the resulting operator to a wave function $\psi$. We thus get

$$D\psi = 0, \tag{195}$$

with

$$D = \left(\frac{h}{2\pi i}\frac{\partial}{\partial x}-G_x\right)^2+\left(\frac{h}{2\pi i}\frac{\partial}{\partial y}-G_y\right)^2+\left(\frac{h}{2\pi i}\frac{\partial}{\partial z}-G_z\right)^2-\frac{1}{c^2}\left(\frac{h}{2\pi i}\frac{\partial}{\partial t}+U\right)^2+m_0^2c^2. \tag{195 a}$$

In the case of 'multiplication' of expressions which, besides ordinary quantities, also contain differential operators, the order of the factors must remain unaltered. Thus the 'product' $\frac{\partial}{\partial x}G_x\psi$, where the operator $\partial/\partial x$ is to be applied to the function $G_x\psi$ standing on its right side, differs from the 'product' $G_x\frac{\partial}{\partial x}\psi$ by the additional term $\frac{\partial G_x}{\partial x}\psi$.
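The effect of the order of the factors can be seen in a small numerical experiment. This is a sketch under stated assumptions: the functions chosen for $G_x$ and $\psi$ are hypothetical, and derivatives are approximated by central differences. The difference between applying $\partial/\partial x$ to the product $G_x\psi$ and multiplying $\partial\psi/\partial x$ by $G_x$ should reproduce $(\partial G_x/\partial x)\psi$.

```python
import math


def d_dx(f, x, step=1e-5):
    """Central-difference approximation to df/dx."""
    return (f(x + step) - f(x - step)) / (2.0 * step)


def G(x):
    """Hypothetical potential-momentum component G_x(x)."""
    return math.sin(x)


def psi(x):
    """Hypothetical (real) wave function."""
    return math.exp(-x * x)


def ordered_product_difference(x):
    """(d/dx)(G psi) - G (d psi/dx), which should equal (dG/dx) psi."""
    return d_dx(lambda s: G(s) * psi(s), x) - G(x) * d_dx(psi, x)


x0 = 0.3
print(ordered_product_difference(x0))
print(math.cos(x0) * psi(x0))  # (dG/dx) psi evaluated analytically
```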

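If we take this into consideration we obtain

$$\begin{aligned}\left(\frac{h}{2\pi i}\frac{\partial}{\partial x}-G_x\right)^2\psi &= -\frac{h^2}{4\pi^2}\frac{\partial^2\psi}{\partial x^2}-\frac{h}{2\pi i}\frac{\partial}{\partial x}(G_x\psi)-\frac{h}{2\pi i}G_x\frac{\partial\psi}{\partial x}+G_x^2\psi\\ &= -\frac{h^2}{4\pi^2}\frac{\partial^2\psi}{\partial x^2}-\frac{h}{\pi i}G_x\frac{\partial\psi}{\partial x}-\frac{h}{2\pi i}\frac{\partial G_x}{\partial x}\psi+G_x^2\psi,\end{aligned}$$

and similar expressions for the other terms in the equation. Written out in detail it runs, therefore, as follows:

$$\begin{aligned}&\frac{\partial^2\psi}{\partial x^2}+\frac{\partial^2\psi}{\partial y^2}+\frac{\partial^2\psi}{\partial z^2}-\frac{1}{c^2}\frac{\partial^2\psi}{\partial t^2}-\frac{4\pi i}{h}\left(G_x\frac{\partial\psi}{\partial x}+G_y\frac{\partial\psi}{\partial y}+G_z\frac{\partial\psi}{\partial z}+\frac{U}{c^2}\frac{\partial\psi}{\partial t}\right)\\ &\quad-\frac{2\pi i}{h}\left(\frac{\partial G_x}{\partial x}+\frac{\partial G_y}{\partial y}+\frac{\partial G_z}{\partial z}+\frac{1}{c^2}\frac{\partial U}{\partial t}\right)\psi-\frac{4\pi^2}{h^2}\left(G_x^2+G_y^2+G_z^2-\frac{1}{c^2}U^2+m_0^2c^2\right)\psi = 0.\end{aligned} \tag{196}$$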


If the rest-mass vanishes ($m_0 = 0$), and if there are no external forces, i.e. in the case of an Einstein photon, this equation reduces to the equation

$$\frac{\partial^2\psi}{\partial x^2}+\frac{\partial^2\psi}{\partial y^2}+\frac{\partial^2\psi}{\partial z^2}-\frac{1}{c^2}\frac{\partial^2\psi}{\partial t^2} = 0$$

for electromagnetic waves. Further, it can easily be shown that when $m_0 \neq 0$ and $\mathbf{G} = 0$ the relativistic wave equation (196) for the special case of a harmonic vibration process (i.e. motion with a given constant energy) agrees with the relativistic equation (48 b), § 13, Part I. In fact, if we put $\partial U/\partial t = 0$ and $\psi = \psi_0(x,y,z)\,e^{-2\pi i\nu t}$, equation (196) reduces to the form

$$\nabla^2\psi+\left(\frac{4\pi^2\nu^2}{c^2}-\frac{8\pi^2\nu}{hc^2}U+\frac{4\pi^2}{h^2c^2}U^2-\frac{4\pi^2}{h^2}m_0^2c^2\right)\psi = 0,$$

or, with $\nu = E/h$,

$$\nabla^2\psi+\frac{4\pi^2}{h^2c^2}\left[(E-U)^2-m_0^2c^4\right]\psi = 0, \tag{196 a}$$

which is identical with (48 b), Part I.

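We shall now investigate the relation of equation (196) to the equation of motion of Einstein's mechanics. For this purpose we shall put in (196)

$$\psi = \text{const.}\;e^{i2\pi S/h}. \tag{197}$$

After dividing the result by $(2\pi i/h)^2e^{i2\pi S/h}$ and dropping the terms which contain the small factor $h/2\pi i$ we obtain the equation

$$\begin{aligned}&\left(\frac{\partial S}{\partial x}\right)^2+\left(\frac{\partial S}{\partial y}\right)^2+\left(\frac{\partial S}{\partial z}\right)^2-\frac{1}{c^2}\left(\frac{\partial S}{\partial t}\right)^2-2\left(G_x\frac{\partial S}{\partial x}+G_y\frac{\partial S}{\partial y}+G_z\frac{\partial S}{\partial z}+\frac{U}{c^2}\frac{\partial S}{\partial t}\right)\\ &\quad+G_x^2+G_y^2+G_z^2-\frac{1}{c^2}U^2+m_0^2c^2 = 0,\end{aligned}$$

which must obviously be the relativity form of the Hamilton-Jacobi equation. It can be written more briefly in the form

$$\left(\frac{\partial S}{\partial x}-G_x\right)^2+\left(\frac{\partial S}{\partial y}-G_y\right)^2+\left(\frac{\partial S}{\partial z}-G_z\right)^2-\frac{1}{c^2}\left(\frac{\partial S}{\partial t}+U\right)^2+m_0^2c^2 = 0 \tag{197 a}$$

and can be obtained directly from (195) if we replace the vector $\mathbf{p}$ in $D$ by the vector $\mathbf{g}$ defined according to the equations

$$g_x = \frac{\partial S}{\partial x},\quad g_y = \frac{\partial S}{\partial y},\quad g_z = \frac{\partial S}{\partial z},\quad g_t = \frac{\partial S}{\partial t}. \tag{197 b}$$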

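From these equations, which refer to the copy continuum of one particle, one can easily go over to the relativistic equations of motion of a given copy and, indeed, just as in the non-relativity theory, by differentiation of equation (197 a) with regard to the coordinates and the time, bearing in mind the following relations resulting from (194) and (197 b),

$$\frac{\partial S}{\partial x}-G_x = mv_x,\ \ldots,\quad \frac{\partial S}{\partial t}+U = -mc^2.$$

If we differentiate (197 a) with regard to $x$ and divide by $m$, we get

$$v_x\left(\frac{\partial^2S}{\partial x^2}-\frac{\partial G_x}{\partial x}\right)+v_y\left(\frac{\partial^2S}{\partial x\,\partial y}-\frac{\partial G_y}{\partial x}\right)+v_z\left(\frac{\partial^2S}{\partial x\,\partial z}-\frac{\partial G_z}{\partial x}\right)+\frac{\partial^2S}{\partial x\,\partial t}+\frac{\partial U}{\partial x} = 0$$

or, by (197 b),

$$\frac{\partial g_x}{\partial x}\frac{dx}{dt}+\frac{\partial g_x}{\partial y}\frac{dy}{dt}+\frac{\partial g_x}{\partial z}\frac{dz}{dt}+\frac{\partial g_x}{\partial t} = v_x\frac{\partial G_x}{\partial x}+v_y\frac{\partial G_y}{\partial x}+v_z\frac{\partial G_z}{\partial x}-\frac{\partial U}{\partial x},$$

i.e.

$$\frac{dg_x}{dt} = \frac{\partial}{\partial x}(\mathbf{v}\cdot\mathbf{G}-U). \tag{198}$$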


The three-dimensional velocity vector v referring to a definite particle 
is here no longer considered as an explicit function of the coordinates 
and the time. Therefore, its partial derivatives with regard to x, y, z, t 
must be put equal to zero. 

The equations for $g_y$ and $g_z$ analogous to (198) will not be written down here. The fourth equation runs

$$\frac{dg_t}{dt} = \frac{\partial}{\partial t}(\mathbf{v}\cdot\mathbf{G}-U).$$

If the potential functions $\mathbf{G}$ and $U$ are independent of the time (static field of force) this equation reduces to $dg_t/dt = 0$, i.e. $-g_t = E = \text{const.}$ (law of the conservation of energy).

If we split up $g_x$ in (198) into the sum of $mv_x$ and $G_x$, we then get

$$\frac{dg_x}{dt} = \frac{d}{dt}(mv_x)+\frac{\partial G_x}{\partial t}+\frac{\partial G_x}{\partial x}\frac{dx}{dt}+\frac{\partial G_x}{\partial y}\frac{dy}{dt}+\frac{\partial G_x}{\partial z}\frac{dz}{dt},$$

and consequently,

$$\frac{d}{dt}(mv_x) = -\frac{\partial U}{\partial x}-\frac{\partial G_x}{\partial t}+v_y\left(\frac{\partial G_y}{\partial x}-\frac{\partial G_x}{\partial y}\right)-v_z\left(\frac{\partial G_x}{\partial z}-\frac{\partial G_z}{\partial x}\right).$$

The right side of this equation must obviously represent the $x$-component of the force $\mathbf{f}$ acting on the particle.

If we put

$$U = e\phi,\qquad \mathbf{G} = \frac{e}{c}\mathbf{A}, \tag{199}$$

with

$$\mathbf{E} = -\nabla\phi-\frac{1}{c}\frac{\partial\mathbf{A}}{\partial t}, \tag{199 a}$$

$$\mathbf{H} = \operatorname{curl}\mathbf{A}, \tag{199 b}$$

we obtain

$$f_x = e\left(E_x+\frac{v_y}{c}H_z-\frac{v_z}{c}H_y\right),$$

or in vector notation

$$\mathbf{f} = e\left(\mathbf{E}+\frac{1}{c}\,\mathbf{v}\times\mathbf{H}\right) \tag{200}$$

and

$$\frac{d}{dt}(m\mathbf{v}) = \mathbf{f}. \tag{200 a}$$

Here $\phi$ and $\mathbf{A}$ are the scalar (electric) and the vector (magnetic) potentials, $\mathbf{E}$ and $\mathbf{H}$ the electric and magnetic field strengths respectively, while $e$ is the electric charge of the particle. A point-like corpuscle can thus be defined by two constants only: its rest-mass and its charge.

The vector defined by (200) represents, therefore, the external force (so-called 'Lorentz force') acting on an electron or a proton which is moving in an arbitrary electromagnetic field.

The time projection of the four-dimensional equation of motion, of which (200 a) is the space projection, has the form

$$\frac{d}{dt}(mc^2) = e\mathbf{E}\cdot\mathbf{v} = e(E_xv_x+E_yv_y+E_zv_z). \tag{200 b}$$

We thus obtain the relation

$$\frac{d}{dt}(mc^2) = \mathbf{f}\cdot\mathbf{v} = \mathbf{v}\cdot\frac{d}{dt}(m\mathbf{v}),$$

from which at once follows the well-known formula

$$m = \frac{m_0}{\sqrt{1-v^2/c^2}}.$$
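The statement that only the electric field contributes to the rate of change of the energy can be illustrated with a short sketch of the Lorentz force $\mathbf{f} = e(\mathbf{E}+\mathbf{v}\times\mathbf{H}/c)$. The numerical values of the charge, the fields and the velocity below are arbitrary assumptions chosen for illustration, in arbitrary units. The magnetic part of the force is perpendicular to $\mathbf{v}$, so that $\mathbf{f}\cdot\mathbf{v} = e\mathbf{E}\cdot\mathbf{v}$.

```python
def cross(a, b):
    """Vector product of two 3-component tuples."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dot(a, b):
    """Scalar product of two 3-component tuples."""
    return sum(x * y for x, y in zip(a, b))


def lorentz_force(e, E, H, v, c):
    """Force e * (E + v x H / c) on a charge e moving with velocity v."""
    v_cross_H = cross(v, H)
    return tuple(e * (E[i] + v_cross_H[i] / c) for i in range(3))


# Arbitrary illustrative values (assumptions, arbitrary units).
e, c = -1.0, 2.0
E = (0.3, -0.1, 0.5)
H = (1.0, 2.0, -0.5)
v = (0.2, 0.4, -0.1)

f = lorentz_force(e, E, H, v, c)
print(dot(f, v))      # work done per unit time by the total force
print(e * dot(E, v))  # work done per unit time by the electric field alone
```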


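It still remains to find out the expressions, corresponding to the relativity wave equation just considered, for the quantities $\rho$ (probability density) and $\mathbf{j}$ (probability current density). This is done most simply as follows (according to W. Gordon). We first introduce the operators

$$u_x = \frac{h}{2\pi i}\frac{\partial}{\partial x}-\frac{e}{c}A_x,\quad u_y = \frac{h}{2\pi i}\frac{\partial}{\partial y}-\frac{e}{c}A_y,\quad u_z = \frac{h}{2\pi i}\frac{\partial}{\partial z}-\frac{e}{c}A_z,\quad u_t = \frac{1}{c}\left(\frac{h}{2\pi i}\frac{\partial}{\partial t}+e\phi\right), \tag{201}$$

by means of which we can write the relativistic wave equation (195) in the form

$$(u_x^2+u_y^2+u_z^2-u_t^2+m_0^2c^2)\psi = 0. \tag{201 a}$$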


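We multiply this equation on the left by $\psi^*$ and subtract from it the conjugate equation for $\psi^*$ multiplied by $\psi$. We then get, bearing in mind (196), the formula

$$\begin{aligned}&\frac{\partial}{\partial x}\left(\psi^*\frac{\partial\psi}{\partial x}-\psi\frac{\partial\psi^*}{\partial x}-\frac{4\pi i}{h}G_x\psi\psi^*\right)+\frac{\partial}{\partial y}\left(\psi^*\frac{\partial\psi}{\partial y}-\psi\frac{\partial\psi^*}{\partial y}-\frac{4\pi i}{h}G_y\psi\psi^*\right)\\ &\quad+\frac{\partial}{\partial z}\left(\psi^*\frac{\partial\psi}{\partial z}-\psi\frac{\partial\psi^*}{\partial z}-\frac{4\pi i}{h}G_z\psi\psi^*\right)-\frac{1}{c^2}\frac{\partial}{\partial t}\left(\psi^*\frac{\partial\psi}{\partial t}-\psi\frac{\partial\psi^*}{\partial t}+\frac{4\pi i}{h}U\psi\psi^*\right) = 0,\end{aligned}$$

or

$$\frac{\partial j_x}{\partial x}+\frac{\partial j_y}{\partial y}+\frac{\partial j_z}{\partial z}+\frac{\partial\rho}{\partial t} = 0.$$

This formula can be regarded as the equation of continuity if we define the quantities

$$j_x = \frac{1}{2m_0}\left(\psi^*u_x\psi+\psi u_x^*\psi^*\right),\ \ldots,\qquad \rho = -\frac{1}{2m_0c}\left(\psi^*u_t\psi+\psi u_t^*\psi^*\right) \tag{201 b}$$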


as the components of the current-density vector and the copy density respectively. With regard to the first, this definition is the immediate generalization of that given earlier. The expression for $\rho$, on the other hand, seems to be completely different from $\psi\psi^*$ which has been used so far. We can easily convince ourselves, however, by the example of a conservative motion, that this difference is, in practice, quite unimportant. Putting

$$\psi = \psi_0(x,y,z)\,e^{-i2\pi Et/h},$$

we obtain

$$\frac{h}{2\pi i}\frac{\partial\psi}{\partial t} = -E\psi \quad\text{and}\quad \frac{h}{2\pi i}\frac{\partial\psi^*}{\partial t} = E\psi^*,$$

so that

$$\rho = \frac{1}{m_0c^2}\psi\psi^*(E-U) = \frac{1}{m_0c^2}\psi\psi^*(m_0c^2+W-U),$$

and hence

$$\rho = \psi\psi^*\left(1+\frac{W-U}{m_0c^2}\right), \tag{201 c}$$

i.e., in so far as the kinetic energy $W-U$ is small compared with $m_0c^2$, $\rho \cong \psi\psi^*$.

With regard to the exact meaning of $\psi\psi^*$, one can easily show that it corresponds to the rest density. This can be seen, for example, from the relation $\rho/(\psi\psi^*) = m/m_0$ which is obtained from (201 c) if the mass $m$ is introduced by means of the usual formula

$$m = m_0+\frac{W-U}{c^2}.$$

26. Magnetic Forces in the Approximate Non-Relativistic Wave 
Mechanics 

If in reducing equation (196) to the form corresponding to conservative motion the potential momentum is supposed to be different from zero, we get instead of (196 a) an equation which in vector form can be written as follows:

$$\left\{\left(\frac{h}{2\pi i}\nabla-\mathbf{G}\right)^2-\frac{1}{c^2}\left[(E-U)^2-m_0^2c^4\right]\right\}\psi = 0. \tag{202}$$

If the energy $W = E-m_0c^2$ is small compared with the rest-energy $E_0 = m_0c^2$, which classically corresponds to motion with a velocity $v$ small compared with the velocity of light, then we can put with sufficiently good approximation

$$(E-U)^2 = (E_0+W-U)^2 \cong E_0^2+2E_0(W-U),$$

neglecting the relatively small term $(W-U)^2$, and thus replace equation (202), which is supposed to be exact, by the approximate equation

$$\left\{\left(\frac{h}{2\pi i}\nabla-\mathbf{G}\right)^2-2m_0(W-U)\right\}\psi = 0. \tag{202 a}$$

This equation corresponds to the classical equation of motion allowing for the presence of magnetic forces (derived from the constant potential $\mathbf{G}$) but neglecting the relativistic variation of the mass with velocity.

As a rule, the magnetic forces are relatively weak, so that the terms of (202 a) which are quadratic in $\mathbf{G}$ can be neglected compared with the linear terms. With this condition, equation (202 a) reduces to the still simpler form

$$\left\{\left(\frac{h}{2\pi i}\nabla\right)^2-\frac{h}{2\pi i}\nabla\cdot\mathbf{G}-\frac{h}{2\pi i}\mathbf{G}\cdot\nabla-2m_0(W-U)\right\}\psi = 0.$$

Now we have $\nabla\cdot\mathbf{G}\psi = \operatorname{div}\mathbf{G}\psi = \mathbf{G}\cdot\nabla\psi+\psi\operatorname{div}\mathbf{G}$.

It is well known further that in the case of a static field the divergence of the magnetic potential $\mathbf{A}$ vanishes, so that we have $\operatorname{div}\mathbf{G} = 0$. The preceding equation can therefore be written in the form

$$\left\{\frac{1}{2m_0}\left(\frac{h}{2\pi i}\nabla\right)^2-\frac{h}{2\pi im_0}\mathbf{G}\cdot\nabla+(U-W)\right\}\psi = 0. \tag{202 b}$$

So far we have been making perfectly permissible approximations. We are now going to generalize the preceding approximate equations for the case of non-conservative motion (in a static or non-static field) in the same way as was done before with $\mathbf{G} = 0$, namely, by replacing the energy $W$ by the operator $-p_t$ (or $-p_t-m_0c^2$; the constant term $m_0c^2$ is immaterial in this case because it is absorbed by the potential energy).

We thus obtain the equations

$$\left[\frac{1}{2m_0}\left(\frac{h}{2\pi i}\nabla\right)^2-\frac{h}{2\pi im_0}\mathbf{G}\cdot\nabla+U+p_t\right]\psi = 0 \tag{203}$$

for weak magnetic fields or

$$\left[\frac{1}{2m_0}\left(\frac{h}{2\pi i}\nabla-\mathbf{G}\right)^2+U+p_t\right]\psi = 0 \tag{203 a}$$

for strong fields; these can be considered as the generalization of Schrödinger's equation (193 a) for the case of the presence of magnetic forces, with neglect of the relativistic variation of mass with velocity.

The transition from equations (202 a) and (202 b) to (203) and (203 a) is certainly an illogical step, which, moreover, is in contradiction with the results arrived at in the preceding section. For if equations (202 a) and (202 b) are permissible approximations of equation (202), which is supposed to be exact, referring to the case of motion with a definite energy, equations (203) or (203 a) cannot be considered as an approximation, in the strict sense of the word, to the general equation (196). In fact, the latter involves a second derivative of $\psi$ with regard to the time, which we are not entitled to drop or to replace by a first derivative multiplied by a constant factor, unless the dependence of $\psi$ upon the time is given by the factor $e^{-i2\pi Et/h}$, corresponding to a motion with the constant energy $E$.

We have here an approximation of a kind similar to that which is constituted by the Hamilton-Jacobi equation with respect to Schrödinger's equation for the function $S = (h/2\pi i)\log\psi$: in the latter case, however, it is the second derivatives with regard to the space coordinates and not to the time which have to be dropped.

The preceding consideration does not, however, invalidate equations (203) and (203 a) as good approximations to the truth within a certain range corresponding to a negligible variation of the mass with velocity. Apart from the fact that the validity of the relativistic equation (196) still remains to be proved (and we shall see later that, as a matter of fact, the contrary can be proved), equations (203) and (203 a) represent a very natural generalization of Schrödinger's equation for the presence of magnetic forces, and must therefore describe the motion affected by such forces just as well as Schrödinger's equation describes a motion unaffected by the latter.

An important advantage of the 'approximate' equations (203) and (203 a) over the 'exact' equation (196) consists in the fact that they fit into the general scheme of the operator theory developed on the basis of Schrödinger's equation, since they can be written in the same form, namely,

$$(H+p_t)\psi = 0, \tag{204}$$

where the Hamiltonian or energy operator $H$ must be defined by the generalized formula

$$H = \frac{1}{2m_0}\left(\frac{h}{2\pi i}\nabla-\mathbf{G}\right)^2+U, \tag{204 a}$$

or

$$H = \frac{1}{2m_0}\left(\frac{h}{2\pi i}\nabla\right)^2-\frac{h}{2\pi im_0}\mathbf{G}\cdot\nabla+U. \tag{204 b}$$


Equation (196), since it contains the square of the operator $p_t$, cannot be written in the form (204), unless we assume that it is possible to extract square roots of operators in the same way as of ordinary numbers and succeed in finding an equation linear with regard to $p_t$ and actually equivalent to (196).

Leaving this question till a later section, we shall now indicate briefly the principal modifications of the general theory, developed in the preceding chapters, which are necessitated by the generalized form of the Hamiltonian operator (204 a) or (204 b).

First of all, we must notice that this operator is complex (which does not prevent it from representing a real quantity, just as the operator $\mathbf{p} = \frac{h}{2\pi i}\nabla$ does). We must distinguish therefore the operator $H$ from the conjugate complex operator $H^*$, which determines the conjugate complex wave function $\psi^*$ by the equation

$$(H^*-p_t)\psi^* = 0. \tag{204 c}$$

Multiplying this equation by $\psi$ and subtracting it from equation (204) multiplied by $\psi^*$, we get

$$\frac{\partial\rho}{\partial t}+\operatorname{div}\mathbf{j} = 0,$$

with the old non-relativistic expression $\psi\psi^*$ for $\rho$ and the expression

$$\mathbf{j} = \frac{1}{2m_0}\left[\psi^*\left(\frac{h}{2\pi i}\nabla-\mathbf{G}\right)\psi+\psi\left(-\frac{h}{2\pi i}\nabla-\mathbf{G}\right)\psi^*\right] \tag{205}$$

for the current density. This expression turns out to be the same for the two Hamiltonians (204 a) and (204 b), and coincides with the expression derived above from the 'exact' relativistic theory [cf. the first equation (201 b)].

Equation (205) can obviously be rewritten in the form

$$\mathbf{j} = \frac{1}{m_0}\,\rho\,R(\nabla S-\mathbf{G}) = \frac{1}{m_0}\,\rho\,[\nabla R(S)-\mathbf{G}], \tag{205 a}$$

where

$$S = \frac{h}{2\pi i}\log\psi$$

and $R(S)$ is its real part. In the approximation corresponding to the classical (Newtonian) theory of the motion of an electron in an electromagnetic field, $S$ is the action function and its gradient $\nabla S$ is the total momentum $\mathbf{g}$. The difference $\nabla S-\mathbf{G}$ thus reduces to the proper momentum $m_0\mathbf{v}$ and the vector $\mathbf{j}$ reduces to the product $\rho\mathbf{v}$, just as in the absence of magnetic forces, as, of course, is to be expected.

The complex character of the operator $H$ necessitates the revision of some of the properties of its characteristic functions, which were established on the assumption that $H$ was real. This refers, in the first place, to the orthogonality property which was deduced from the self-adjointness of $H$, i.e. from the formula

$$\int(f_1Hf_2-f_2Hf_1)\,dV = 0.$$

Now in the general case of a complex $H$ defined by (204 a) or (204 b), this formula does not hold and must be replaced by

$$\int(f_1Hf_2-f_2H^*f_1)\,dV = 0. \tag{206}$$

We have, in fact, according to either one of the two definitions of $H$,

$$f_1Hf_2-f_2H^*f_1 = -\frac{h^2}{8\pi^2m_0}\left(f_1\nabla^2f_2-f_2\nabla^2f_1\right)-\frac{h}{2\pi im_0}\left(f_1\mathbf{G}\cdot\nabla f_2+f_2\mathbf{G}\cdot\nabla f_1\right),$$

or, so long as $\operatorname{div}\mathbf{G} = 0$,

$$f_1Hf_2-f_2H^*f_1 = \operatorname{div}\mathbf{f}_{12}, \tag{206 a}$$

where

$$\mathbf{f}_{12} = -\frac{h^2}{8\pi^2m_0}\left(f_1\nabla f_2-f_2\nabla f_1\right)-\frac{h}{2\pi im_0}\,\mathbf{G}f_1f_2. \tag{206 b}$$

If, therefore, the functions $f_1$ and $f_2$ vanish sufficiently rapidly at infinity (so that the integral $\int f_1f_2\,dV$ converges), we must have equation (206). Putting, in particular, $f_1 = \psi^*_{H'}$ and $f_2 = \psi_{H''}$, where $H'$ and $H''$ are two different (real) characteristic values of $H$, we get

$$\int\left(\psi^*_{H'}H\psi_{H''}-\psi_{H''}H^*\psi^*_{H'}\right)dV = (H''-H')\int\psi^*_{H'}\psi_{H''}\,dV = 0,$$

whence

$$\int\psi^*_{H'}\psi_{H''}\,dV = 0 \qquad (H' \neq H''),$$

as before.

It should be mentioned that in the case of a real $H$ (i.e. in the absence of magnetic forces), the characteristic functions $\psi_{H'}$, neglecting the time-factor $e^{-i2\pi H't/h}$, can always be defined to be real, i.e. to have real amplitudes $\psi^0_{H'}(x,y,z)$, while in the case of a complex $H$ these amplitudes are complex. The orthogonality relation holds therefore only in the above form, and not in the form

$$\int\psi_{H'}\psi_{H''}\,dV = 0 \quad\text{or}\quad \int\psi^*_{H'}\psi^*_{H''}\,dV = 0$$

in which it can be expressed if $H$ is real.

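It should also be mentioned that the property of self-adjointness expressed by equation (206 a) refers not only to the operator $H$ but to any operator which represents a real quantity, i.e. which is a real function of the coordinates and the elementary operators $p_x$, $p_y$, $p_z$, or of the vector-operator $\mathbf{p} = \frac{h}{2\pi i}\nabla$. This can easily be shown with the help of the relations

$$f_1p_x^{2n}f_2-f_2p_x^{2n}f_1 = \frac{\partial}{\partial x}f_{12},$$

where

$$f_{12} = \left(\frac{h}{2\pi i}\right)^{2n}\left[f_1\frac{\partial^{2n-1}f_2}{\partial x^{2n-1}}-\frac{\partial f_1}{\partial x}\frac{\partial^{2n-2}f_2}{\partial x^{2n-2}}+\frac{\partial^2f_1}{\partial x^2}\frac{\partial^{2n-3}f_2}{\partial x^{2n-3}}-\ldots-\frac{\partial^{2n-1}f_1}{\partial x^{2n-1}}f_2\right],$$

and

$$f_1p_x^{2n+1}f_2+f_2p_x^{2n+1}f_1 = \frac{\partial}{\partial x}f_{12},$$

with

$$f_{12} = \left(\frac{h}{2\pi i}\right)^{2n+1}\left[f_1\frac{\partial^{2n}f_2}{\partial x^{2n}}-\frac{\partial f_1}{\partial x}\frac{\partial^{2n-1}f_2}{\partial x^{2n-1}}+\ldots+\frac{\partial^{2n}f_1}{\partial x^{2n}}f_2\right],$$

in conjunction with

$$\left(p_x^{2n}\right)^* = p_x^{2n},\qquad \left(p_x^{2n+1}\right)^* = -p_x^{2n+1}.$$

We can thus say that not only the energy operator, but any operator $F$ representing a real physical quantity is self-adjoint in the sense of the equation

$$f_1Ff_2-f_2F^*f_1 = \operatorname{div}\mathbf{f}_{12},$$

and that the characteristic functions of this operator are orthogonal with regard to each other in the same sense as the characteristic functions of the energy operator.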
Another result which was associated with the reality of $H$ and its self-adjointness in the old sense was the possibility of replacing the differential equation for its characteristic functions and values

$$(H-H')\psi = 0$$

by the variational equation

$$\delta\int\psi^*H\psi\,dV = 0,$$

with the condition $\int\psi^*\psi\,dV = \text{const.}\ (= 1)$.

Since, in the case of a complex $H$, the function $\psi^*$ no longer satisfies the same equation as $\psi$, the preceding results seem to require a modification.

As a matter of fact, however, no such modification is needed, for we have

$$\delta\int\psi^*H\psi\,dV = \int\delta\psi^*H\psi\,dV+\int\psi^*H\delta\psi\,dV$$

and, with the help of (206),

$$\int\psi^*H\delta\psi\,dV = \int\delta\psi\,H^*\psi^*\,dV,$$

that is,

$$\delta\int\psi^*H\psi\,dV = \int\delta\psi^*H\psi\,dV+\int\delta\psi\,H^*\psi^*\,dV = \delta\int\psi H^*\psi^*\,dV.$$

The variational principle thus preserves its usual form

$$\delta\bar H = 0,\qquad E = \text{const.}\ (= 1)$$

with

$$\bar H = \int\psi^*H\psi\,dV = \int\psi H^*\psi^*\,dV$$

and

$$E = \int\psi^*\psi\,dV.$$

As has been already pointed out, the two equations $\delta\bar H = 0$, $\delta E = 0$ can be replaced by the single one $\delta\bar H = 0$ if $\bar H$ is defined by the formula

$$\bar H = \int\psi^*H\psi\,dV\Big/\int\psi^*\psi\,dV \quad\left(\text{or } \int\psi H^*\psi^*\,dV\Big/\int\psi^*\psi\,dV\right),$$

without any normalizing condition for the function $\psi$.
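The fact that the quotient $\int\psi^*H\psi\,dV\big/\int\psi^*\psi\,dV$ remains real and stationary for a complex self-adjoint $H$ can be illustrated on the smallest possible model. The sketch below is an assumption-laden analogue, not the operator of the text: the differential operator is replaced by a hypothetical 2×2 Hermitian matrix with complex off-diagonal elements, and the integrals by finite sums. The quotient is real for any trial vector, lies between the two characteristic values, and coincides with a characteristic value when the trial vector is the corresponding characteristic vector.

```python
import math


def rayleigh_quotient(H, psi):
    """(psi* H psi) / (psi* psi) for a 2x2 matrix H and a 2-component vector psi."""
    H_psi = [
        H[0][0] * psi[0] + H[0][1] * psi[1],
        H[1][0] * psi[0] + H[1][1] * psi[1],
    ]
    numerator = psi[0].conjugate() * H_psi[0] + psi[1].conjugate() * H_psi[1]
    denominator = abs(psi[0]) ** 2 + abs(psi[1]) ** 2
    return numerator / denominator


def characteristic_values(H):
    """Both characteristic values of a 2x2 Hermitian matrix, smaller one first."""
    a, d, b = H[0][0].real, H[1][1].real, H[0][1]
    mean = 0.5 * (a + d)
    radius = math.sqrt((0.5 * (a - d)) ** 2 + abs(b) ** 2)
    return mean - radius, mean + radius


# Hypothetical complex Hermitian matrix standing in for the operator H.
H = [[1.0 + 0j, 0.5 - 0.7j],
     [0.5 + 0.7j, -2.0 + 0j]]

low, high = characteristic_values(H)
trial = [0.3 + 0.4j, 1.0 - 0.2j]
q = rayleigh_quotient(H, trial)
print(q)           # imaginary part vanishes up to rounding
print(low, high)   # the real part of q lies between these two values

# Characteristic vector belonging to the upper value: (b, high - a).
upper_vector = [H[0][1], complex(high - H[0][0].real)]
print(rayleigh_quotient(H, upper_vector))  # reproduces the upper characteristic value
```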

It should be noticed further that the two equations $\delta\bar H = 0$, $\delta E = 0$ can be split up, as it were, into the following two pairs of equations:

$$\int\delta\psi^*H\psi\,dV = 0,\qquad \int\delta\psi^*\psi\,dV = 0 \tag{207}$$

and

$$\int\delta\psi\,H^*\psi^*\,dV = 0,\qquad \int\delta\psi\,\psi^*\,dV = 0,$$

the first pair being equivalent to the equation $(H-H')\psi = 0$ and the second to the equation $(H^*-H')\psi^* = 0$.
The preceding result can easily be generalized for non-stationary motion, the equation $\left(H+\frac{h}{2\pi i}\frac{\partial}{\partial t}\right)\psi = 0$ being equivalent to

$$\int\delta\psi^*\left(H+\frac{h}{2\pi i}\frac{\partial}{\partial t}\right)\psi\,dV = 0, \tag{207 a}$$

and the conjugate complex equation $\left(H^*-\frac{h}{2\pi i}\frac{\partial}{\partial t}\right)\psi^* = 0$ to

$$\int\delta\psi\left(H^*-\frac{h}{2\pi i}\frac{\partial}{\partial t}\right)\psi^*\,dV = 0.$$

So long as $\psi$ satisfies the equation

$$\left(H+\frac{h}{2\pi i}\frac{\partial}{\partial t}\right)\psi = 0$$

and $\delta\psi^*$ is quite arbitrary, the variational equation (207 a) is nothing but a transcription of the ordinary differential equation of motion. The same variational equation is obtained, however, as the condition for the error involved to be permanently small† when $\psi$ is replaced by an approximate function of some relatively simple form $\psi_1$.

At some initial moment $t = t_0$ the form of the function $\psi$ can be fixed quite arbitrarily. We can accordingly identify $\psi(t_0)$ with $\psi_1(t_0)$.

Now $\psi_1(t)$ does not satisfy the equation $\left(H+\frac{h}{2\pi i}\frac{\partial}{\partial t}\right)\psi = 0$ but an equation of the form

$$\left(H+\frac{h}{2\pi i}\frac{\partial}{\partial t}\right)\psi_1 = \psi_2. \tag{207 b}$$

Our problem is thus reduced to that of making the additional term $\psi_2(t)$ as small as possible for any time $t$. Taking $t = t_0+dt$, we get from the preceding equation

$$\psi_1(t_0+dt) = \psi_1(t_0)-\frac{2\pi i}{h}\left[H\psi_1(t_0)-\psi_2(t_0)\right]dt.$$

Now if the function $\psi_1(t_0+dt)$ is altered by a small amount $\delta\psi_1(t_0+dt)$, the function $\psi_1(t_0)$ remaining the same as before, the corresponding variation of the correction term $\psi_2(t_0)$ will be

$$\delta\psi_2(t_0) = \frac{h}{2\pi i\,dt}\,\delta\psi_1(t_0+dt).$$

The condition that $\psi_2(t_0)$ should be as small as possible for all values of the coordinates can be stated as the minimum condition for the integral $\int\psi_2^*(t_0)\psi_2(t_0)\,dV$ and is equivalent accordingly to the equation

$$\int\delta\psi_2^*(t_0)\,\psi_2(t_0)\,dV = 0.$$

† The argument presented below is taken from Dirac's appendix to the Russian edition of his book, The Principles of Quantum Mechanics.

Replacing here $\delta\psi_2^*(t_0)$ by $-\frac{h}{2\pi i\,dt}\,\delta\psi_1^*(t_0+dt)$, we get

$$\int\delta\psi_1^*(t_0+dt)\,\psi_2(t_0)\,dV = 0,$$

or passing to the limit $dt \to 0$ and dropping the index 0 (since the above results must hold for all values of $t$)

$$\int\delta\psi_1^*(t)\,\psi_2(t)\,dV = 0.$$

This equation means that the correction $\psi_2(t)$ must be orthogonal to any variation of the approximate function $\psi_1(t)$. Hence $\psi_2$ can be eliminated from the equation (207 b) if the latter is multiplied by $\delta\psi_1^*$ and integrated over the coordinates, thus giving equation (207 a) with the exact function $\psi$ replaced by the approximate one $\psi_1$.

The expressions $\int\psi^*H\psi\,dV$ and $\int\psi H^*\psi^*\,dV$ for $\bar H$ can easily be put in the symmetrical form

$$\bar H = \int\left[\frac{1}{2m_0}\left(\frac{h}{2\pi i}\nabla-\mathbf{G}\right)\psi\cdot\left(-\frac{h}{2\pi i}\nabla-\mathbf{G}\right)\psi^*+U\psi\psi^*\right]dV \tag{208}$$

if $H$ is defined by (204 a), that is

where $\rho = \psi\psi^*$ is the density of probability and $\mathbf{j}$ the probability current density as defined by (205). Using the approximate expression

B - \ •] iV - 

(208 b) 

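which coincides with (208 a) if, in the above definition of $\mathbf{j}$, we put $\mathbf{G} = 0$, thus coming back to the old definition of the current density

$$\mathbf{j} = \frac{h}{4\pi im_0}\left(\psi^*\nabla\psi-\psi\nabla\psi^*\right).$$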

So long as the reality of the characteristic values of the operator H 
and the mutual orthogonality of its characteristic functions is unaffected 
by that change of it which corresponds to the presence of a magnetic 
field, we can preserve, without any modification, all the results of the 
preceding chapters concerning the matrix representation of physical 
quantities ‘from the point of view’ of $H$, the transformation theory and
the perturbation theory. 

If the magnetic (or, in general, the electromagnetic) field specified by
the vector $\mathbf{G}$ is relatively weak (compared with the field of force defined
by the potential energy $U$), then it can itself be treated as a perturbation.
Subtracting from the Hamiltonian (204 b), which in future may
be denoted by $K$, the usual Hamiltonian

$$H = \frac{1}{2m_0}\left(\frac{h}{2\pi i}\nabla\right)^2 + U,$$

which corresponds to the absence of the ‘perturbing’ forces specified
by $\mathbf{G}$, we get the following expression for the perturbation energy:

$$S = -\frac{h}{2\pi i\,m_0}\,\mathbf{G}\cdot\nabla, \tag{209}$$

or

$$S = \frac{ieh}{2\pi m_0 c}\,\mathbf{A}\cdot\nabla, \tag{209 a}$$

where $\mathbf{A}$ is the vector potential corresponding to $\mathbf{G}$ ($= e\mathbf{A}/c$) and $e$ the
electric charge of the particle under consideration. Putting $\mathbf{p} = \dfrac{h}{2\pi i}\nabla$,
we can rewrite (209) in the form

$$S = -\frac{e}{m_0 c}\,\mathbf{A}\cdot\mathbf{p}. \tag{209 b}$$

The simplest application of this formula is provided by the special
case of the action of a permanent homogeneous magnetic field (Zeeman
effect). Denoting the field strength by $\mathfrak{H}$, we can, in this case, put

$$\mathbf{A} = \tfrac{1}{2}\,\mathfrak{H}\times\mathbf{r}, \tag{210}$$

where $\mathbf{r}$ is the radius vector of the particle. This gives in fact

$$\operatorname{curl}\mathbf{A} = \mathfrak{H},$$

as can be verified most simply with the help of the coordinate representation.
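The verification in coordinates can also be sketched numerically. The following Python fragment is only an illustration, not part of the original argument: the field components, the test point and the finite-difference step are arbitrary assumed values, and the helper names are hypothetical.

```python
# Numerical check (illustrative sketch) that A = (1/2) H x r has curl A = H
# for a homogeneous field H. All numerical values below are assumed test data.

def cross(a, b):
    """Vector product of two 3-component tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def vector_potential(field, r):
    """A = (1/2) H x r, as in equation (210)."""
    return tuple(0.5 * component for component in cross(field, r))


def curl(vector_field, r, step=1e-4):
    """Central-difference curl of a vector field at the point r."""
    def partial(component, axis):
        plus, minus = list(r), list(r)
        plus[axis] += step
        minus[axis] -= step
        return (vector_field(tuple(plus))[component]
                - vector_field(tuple(minus))[component]) / (2 * step)

    return (partial(2, 1) - partial(1, 2),
            partial(0, 2) - partial(2, 0),
            partial(1, 0) - partial(0, 1))


field = (0.3, -1.2, 2.0)      # assumed homogeneous field
point = (0.7, 0.1, -0.4)      # assumed test point
curl_A = curl(lambda r: vector_potential(field, r), point)
print(curl_A)
```

Since $\mathbf{A}$ is linear in the coordinates, the central differences reproduce the field components up to rounding errors.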

Substituting (210) in (209 b), we get

$$S = -\frac{e}{2m_0 c}\,(\mathfrak{H}\times\mathbf{r})\cdot\mathbf{p},$$

which can be rewritten in the form

$$S = -\frac{e}{2m_0 c}\,\mathfrak{H}\cdot(\mathbf{r}\times\mathbf{p}).$$

Now the operator

$$\mathbf{r}\times\mathbf{p} = \mathbf{M}$$

obviously represents the angular momentum of the electron about the
central point (nucleus), from which the radius vector $\mathbf{r}$ is supposed to
be drawn. We thus get

$$S = -\frac{e}{2m_0 c}\,\mathfrak{H}\cdot\mathbf{M},$$

or

$$S = -\mathfrak{H}\cdot\boldsymbol{\mu}, \tag{210 a}$$

where

$$\boldsymbol{\mu} = \frac{e}{2m_0 c}\,\mathbf{M} \tag{210 b}$$

can be defined as the operator representing the magnetic moment due
to the rotation of the electron about the (fixed) nucleus.

This definition follows from the fact that (210 a) has exactly the same
form as the classical expression for the energy of a particle with a
(constant) magnetic moment $\boldsymbol{\mu}$ in a homogeneous magnetic field $\mathfrak{H}$.

If the unperturbed motion is a motion in a central field of force, so
that the vector $\mathbf{M}$ is constant, the vector $\boldsymbol{\mu}$ will also be a constant.
Its characteristic values are equal to those of $\mathbf{M}$ multiplied by $e/2m_0c$.
Taking the $z$-component of $\mathbf{M}$ and remembering that, with suitably
chosen characteristic functions (those proportional to $e^{im\phi}$), the characteristic
values of $M_z$ are equal to integral multiples of $h/2\pi$, we get for the
characteristic values of $\mu_z$ integral multiples of the quantity

$$\mu_1 = \frac{eh}{4\pi m_0 c},$$

which is called the Bohr magneton (since it is equal to the magnetic
moment of a one-quantum Bohr orbit).

If the magnetic field is parallel to the $z$-axis, or rather if the latter
is chosen in the direction of the magnetic field, then the
additional energy of the perturbed states of motion compared with that
of the corresponding unperturbed states can easily be shown to be
equal, by (210 a), to the product of $-\mathfrak{H}$ by the characteristic values of $\mu_z$. In fact
the non-diagonal matrix elements of the perturbation energy

$$S_{nlm,\,n'l'm'}$$

with regard to the functions $\psi_{nlm}$ and $\psi_{n'l'm'}$ all vanish (which means
that the perturbation is of such a kind as to introduce no coupling
forces between the pendulums representing different states), so that
the additional values of the energy $\Delta H'$ reduce to the diagonal elements
of the perturbation matrix. We thus have, in the first approximation,

$$\Delta_1 H'_{nlm} = S_{nlm,\,nlm},$$

or

$$\Delta_1 H' = -\frac{eh\,\mathfrak{H}}{4\pi m_0 c}\,m. \tag{211}$$

This splitting up of the energy-levels by the magnetic field, or rather
the corresponding splitting of the spectral lines due to transitions
between energy-levels with different values of the axial quantum number
$m$, is called the ‘normal’ Zeeman effect. Since only such transitions
occur for which $\Delta m = 0$, $+1$, or $-1$, the normal Zeeman effect consists
in the splitting up of each line into three lines, one of which coincides
with the original line (corresponding to the absence of the magnetic
field), while the other two are displaced in opposite directions by the
amount

$$\Delta\nu = -\frac{e\,\mathfrak{H}}{4\pi m_0 c}. \tag{211 a}$$

The undisplaced line corresponds to harmonic oscillations of the electron
parallel to the magnetic field, while the displaced ones correspond to
circular motion in the one or the other sense about the direction of this
field. The relative intensities of these three lines for the case $\Delta l = +1$
and $\Delta l = -1$ have been determined in § 13, Chap. III.
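To give an idea of the orders of magnitude involved, the Bohr magneton $eh/4\pi m_0c$ and the magnitude of the line displacement $e\mathfrak{H}/4\pi m_0c$ can be evaluated numerically. The sketch below is an illustration only: it assumes Gaussian (CGS) units and present-day values of the constants, which are not quoted in the text, and it works with magnitudes, leaving the sign of the charge aside.

```python
import math

# Assumed present-day constants in Gaussian (CGS) units; not taken from the text.
E_CHARGE = 4.80320471e-10    # magnitude of the electronic charge, esu
PLANCK_H = 6.62607015e-27    # Planck's constant h, erg s
M_ELECTRON = 9.1093837e-28   # electron rest-mass m_0, g
C_LIGHT = 2.99792458e10      # velocity of light c, cm/s


def bohr_magneton():
    """Magnitude of e h / (4 pi m_0 c), in erg per gauss."""
    return E_CHARGE * PLANCK_H / (4 * math.pi * M_ELECTRON * C_LIGHT)


def zeeman_displacement_hz(field_gauss):
    """Magnitude of the displacement e H / (4 pi m_0 c), in cycles per second."""
    return E_CHARGE * field_gauss / (4 * math.pi * M_ELECTRON * C_LIGHT)


def level_shifts_erg(field_gauss, l):
    """Magnitudes of the first-order shifts for m = -l, ..., +l."""
    return {m: bohr_magneton() * field_gauss * m for m in range(-l, l + 1)}


print(bohr_magneton())
print(zeeman_displacement_hz(10000.0))
print(level_shifts_erg(10000.0, 1))
```

With these assumed constants the displacement comes out at roughly 1.4 megacycles per second for each gauss of field strength, which is very small compared with optical frequencies.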

We shall not discuss the Zeeman effect in greater detail here, but
shall postpone this question until a later section where it will be dealt
with in connexion with the complications arising as a consequence of
the hitherto ignored ‘intrinsic’ magnetic moment of the electron
(‘anomalous’ Zeeman effect).

Although the preceding results have been obtained to a first approximation
by the perturbation method, they can easily be shown to
hold exactly, so long as the action of the magnetic field is represented
by the (approximate) operator (209) or (210 a).

We have, in fact, denoting by $\phi$ the azimuthal angle about the $z$-axis
(supposed to coincide with the direction of the magnetic field),

$$M_z = \frac{h}{2\pi i}\frac{\partial}{\partial\phi},$$

and consequently

$$S = \frac{h\,\Delta\nu}{i}\frac{\partial}{\partial\phi}, \tag{211 b}$$

where $\Delta\nu$ is given by (211 a).

If we now compare the exact equation of the electron’s motion

$$(H+S-K')\chi_{K'} = 0$$

with the equation

$$(H-H')\psi_{H'} = 0,$$

corresponding to the absence of the magnetic field, we easily find that
they can be satisfied by the same functions, $\chi_{K'} = \psi_{H'}$ (proportional to $e^{im\phi}$),
if we put

$$K'-H' = \Delta H' = h\,\Delta\nu\,m,$$

in accordance with (211). Thus, in the present case, we have

$$\Delta H' = \Delta_1 H'.$$

We shall consider, in conclusion, another method of dealing with the
effect of a homogeneous magnetic field which is very instructive in that
it brings to light the similarity between the wave-mechanical and the
classical theory.

We shall write the equation of the electron’s motion in the general
form

$$\left(H+S+\frac{h}{2\pi i}\frac{\partial}{\partial t}\right)\chi = 0 \tag{212}$$

and shall introduce, instead of the original coordinate system $x, y, z$,
another system, $x', y', z'\,(= z)$, rotating about the common (fixed) $z$-axis
with a constant angular velocity $\omega$. The azimuthal angle $\phi'$ with respect
to this rotating system is thus connected with $\phi$ by the formula

$$\phi' = \phi-\omega t, \tag{212 a}$$

whence it follows that

$$\frac{\partial\chi}{\partial t} = \frac{\partial'\chi}{\partial t}-\omega\frac{\partial\chi}{\partial\phi'}. \tag{212 b}$$

Now the partial derivative with respect to $t$ in equation (212)
obviously refers to a constant value of $\phi$. Taking account of (212 b), we
can therefore rewrite this equation in the form

$$\left(H+S-\frac{h\omega}{2\pi i}\frac{\partial}{\partial\phi'}+\frac{h}{2\pi i}\frac{\partial'}{\partial t}\right)\chi = 0, \tag{213}$$

where $\partial'\chi/\partial t$ denotes the value of the partial derivative with respect to
$t$, taken for a constant value of $\phi'$. This equation can obviously be
regarded as describing the motion of the electron with respect to the
rotating coordinate system.

Substituting in it the expression (211 b) for $S$, we get, since $\partial/\partial\phi = \partial/\partial\phi'$,

$$\left[H+\frac{h}{i}\left(\Delta\nu-\frac{\omega}{2\pi}\right)\frac{\partial}{\partial\phi'}+\frac{h}{2\pi i}\frac{\partial'}{\partial t}\right]\chi = 0. \tag{213 a}$$

This equation reduces to that which describes the motion of the electron
with respect to the fixed axes in the absence of the magnetic field
(with the fixed axes replaced by the rotating ones) if the angular
velocity $\omega$ is defined by

$$\omega = 2\pi\,\Delta\nu, \tag{213 b}$$

i.e. if the frequency of revolution is just equal to $\Delta\nu$.

This result is identical with that which is obtained with the help of
classical mechanics, where it is interpreted as a precession of the electron’s
orbit about the direction of the magnetic field with the angular velocity
$\omega = 2\pi\,\Delta\nu$ (Larmor’s precession).
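The particular solutions of the equation

$$\left(H+\frac{h}{2\pi i}\frac{\partial'}{\partial t}\right)\chi = 0,$$

corresponding to a conservative motion of the electron with respect to
the rotating axes, are obviously the same as those of the equation
$\left(H+\dfrac{h}{2\pi i}\dfrac{\partial}{\partial t}\right)\psi = 0$, with $\phi$ replaced by $\phi'$. We thus have

$$\chi = \psi_{H'}(\phi')\,e^{-2\pi i H' t/h},$$

where $H'$ is a characteristic value of $H$, i.e., denoting by $\psi^0_{H'} = \psi_{H'}(\phi)$ the same characteristic function referred to the fixed axes, so that $\psi_{H'}(\phi') = \psi^0_{H'}\,e^{-im\omega t}$,

$$\chi_{H'} = \psi_{H'}(\phi')\,e^{-2\pi i H' t/h} = \psi^0_{H'}\,e^{-2\pi i\,(H'+h\omega m/2\pi)\,t/h}.$$

This is another expression of the result $\chi_{K'} = \psi_{H'}$, $K'-H' = h\,\Delta\nu\,m$
found by the preceding method.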

27. Relativistic Wave Mechanics as a Formal Generalization of 
Maxwell’s Electromagnetic Theory of Light 

Coming back to the relativistic theory of the motion of an electron in
an external electromagnetic field, we have to face the following situation.
If the relativistic equation (196) established in § 25 is assumed to
be correct, we must give up the theory of the preceding chapters, so far
as the introduction of the energy operator $H$ is concerned. If, on the
other hand, we wish to preserve this theory and express the wave-mechanical
equation of motion in the form

$$\left(H+\frac{h}{2\pi i}\frac{\partial}{\partial t}\right)\psi = 0,$$

we must replace the relativistic equation (196) by an equation or system
of equations which are linear and not quadratic with respect to the
operator $p_t = \dfrac{h}{2\pi i}\dfrac{\partial}{\partial t}$.

We shall now try the second alternative, not only because it fits in
better with our previous ideas, but also because it is more general than
the first alternative. In fact, the order of a differential equation can
always be increased by repeated differentiation, so that, in particular,
from an equation of the type $(H+p_t)\psi = 0$ we can always pass to an
equation containing the square of $p_t$. This can be done, for instance,
by applying to the preceding equation the operator $H+p_t$ or $H-p_t$,
giving $(H^2+2Hp_t+p_t^2)\psi = 0$ in the first case and $(H^2-p_t^2)\psi = 0$ in the
second (provided, as is assumed here, that $H$ commutes with $p_t$).
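The passage from $(H+p_t)\psi = 0$ to an equation in $p_t^2$ rests on treating $H$ and $p_t$ as commuting operators. A minimal sketch with small integer matrices standing in for the operators makes this visible; the matrices are arbitrary assumed examples chosen only for illustration, not operators of the theory.

```python
# Illustrative sketch: (H - P)(H + P) equals H^2 - P^2 only if H and P commute.
# The 2 x 2 matrices below are assumed toy stand-ins for the operators H and p_t.

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def combine(a, b, sign):
    """Return a + sign * b, element by element."""
    return [[x + sign * y for x, y in zip(row_a, row_b)]
            for row_a, row_b in zip(a, b)]


def product_of_factors(h, p):
    """(H - P)(H + P)."""
    return matmul(combine(h, p, -1), combine(h, p, +1))


def difference_of_squares(h, p):
    """H^2 - P^2."""
    return combine(matmul(h, h), matmul(p, p), -1)


H = [[2, 1], [1, 3]]
P_COMMUTING = matmul(H, H)         # any function of H commutes with H
P_NOT_COMMUTING = [[0, 1], [0, 0]]

print(product_of_factors(H, P_COMMUTING) == difference_of_squares(H, P_COMMUTING))
print(product_of_factors(H, P_NOT_COMMUTING) == difference_of_squares(H, P_NOT_COMMUTING))
```

In the second case the product differs from $H^2-P^2$ by the commutator $HP-PH$, which is why the order of the factors has to be watched as soon as the operators cease to commute.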

Of course we must be prepared to find that the equation of the second
order (with regard to $p_t$), obtained in this way, will be somewhat
different from our original equation (196). Which one is chosen will
ultimately be decided by comparing theory with experiment.

It can easily be shown that a single equation of the first order with
one unknown function $\psi$, satisfying the space-time symmetry requirements
of the relativity theory and giving by repeated differentiation
anything like equation (196) is a thing utterly impossible. It is, however,
possible to replace equation (196) by a system of several equations
of the first order with as many unknown functions, which would satisfy
the space-time symmetry condition and with the help of a second
differentiation would assume a form similar to and, in the special case
of free motion, identical with equation (196). We shall see, moreover,
that this system of equations can be written in the form of a single
equation of the type $(H+p_t)\psi = 0$, where $H$, $p_t$, and $\psi$ are treated as
four-dimensional matrices, or similarly, in one of the following three
equivalent forms $(P_x-p_x)\psi = 0$, $(P_y-p_y)\psi = 0$, $(P_z-p_z)\psi = 0$, where
$P_x$, $P_y$, $P_z$ are matrix operators representing the components of the
electron’s momentum in the same sense as $H = P_t$ represents its energy.
The possibility of writing the equation of motion in these four equivalent
forms is the direct expression of the equivalence of the space coordinates
and the time, which forms the essence of the relativity theory.

The first part of our problem, namely, the establishment of a system 
of first-order equations satisfying the space-time symmetry condition, 
can be solved in a very simple way, with the help of the analogy 
between mechanics and optics, which was the starting-point for the 
development of wave mechanics and which can still be used—with 
certain reservations—as a source of inspiration. 

Equation (201 a)

$$(u_x^2+u_y^2+u_z^2-u_t^2+m_0^2c^2)\psi = 0,$$

in the case of a particle with vanishing charge and rest-mass, reduces to

$$\frac{\partial^2\psi}{\partial x^2}+\frac{\partial^2\psi}{\partial y^2}+\frac{\partial^2\psi}{\partial z^2}-\frac{1}{c^2}\frac{\partial^2\psi}{\partial t^2} = 0, \tag{214}$$

i.e. to the equation of the propagation of light-waves (in empty space)
with the true velocity $c$. If the wave velocity is equal to $c$, then the
velocity of the associated particles must also be equal to $c$, so that these
particles can be identified with photons.

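Now, according to the electromagnetic theory of light, equation
(214), usually denoted as d’Alembert’s equation, does not give a complete
description of the electromagnetic field of the light waves. This field
is specified by six quantities, namely, the three components $E_x$, $E_y$, $E_z$
of the electric field and the three components† $H_x$, $H_y$, $H_z$ of the magnetic
field, these quantities satisfying the well-known equations of Maxwell:

$$\begin{aligned}
\frac{\partial H_z}{\partial y}-\frac{\partial H_y}{\partial z} &= \frac{1}{c}\frac{\partial E_x}{\partial t},\\
\frac{\partial H_x}{\partial z}-\frac{\partial H_z}{\partial x} &= \frac{1}{c}\frac{\partial E_y}{\partial t},\\
\frac{\partial H_y}{\partial x}-\frac{\partial H_x}{\partial y} &= \frac{1}{c}\frac{\partial E_z}{\partial t}
\end{aligned}
\qquad\left[\operatorname{curl}\mathbf{H} = \frac{1}{c}\frac{\partial\mathbf{E}}{\partial t}\right] \tag{215}$$

and

$$\begin{aligned}
\frac{\partial E_z}{\partial y}-\frac{\partial E_y}{\partial z} &= -\frac{1}{c}\frac{\partial H_x}{\partial t},\\
\frac{\partial E_x}{\partial z}-\frac{\partial E_z}{\partial x} &= -\frac{1}{c}\frac{\partial H_y}{\partial t},\\
\frac{\partial E_y}{\partial x}-\frac{\partial E_x}{\partial y} &= -\frac{1}{c}\frac{\partial H_z}{\partial t}
\end{aligned}
\qquad\left[\operatorname{curl}\mathbf{E} = -\frac{1}{c}\frac{\partial\mathbf{H}}{\partial t}\right]. \tag{215 a}$$

To these six equations we may add the following two:

$$\operatorname{div}\mathbf{E} = \frac{\partial E_x}{\partial x}+\frac{\partial E_y}{\partial y}+\frac{\partial E_z}{\partial z} = 0, \tag{216}$$

$$\operatorname{div}\mathbf{H} = \frac{\partial H_x}{\partial x}+\frac{\partial H_y}{\partial y}+\frac{\partial H_z}{\partial z} = 0. \tag{216 a}$$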

The latter equations can, however, for vibrational processes, be regarded
as a consequence of (215) and (215 a) respectively. Thus, if we differentiate
equations (215) with regard to $x$, $y$, $z$, and add them, we get
$\dfrac{\partial}{\partial t}\operatorname{div}\mathbf{E} = 0$. From this it follows (in so far as we reject purely static
fields) that $\operatorname{div}\mathbf{E} = 0$. In the same way we can derive (216 a) from (215 a).

If we differentiate the left side of the first equation (215) with respect
to the time $t$, we obtain, using (215 a),

$$\frac{\partial}{\partial y}\,\frac{1}{c}\frac{\partial H_z}{\partial t}-\frac{\partial}{\partial z}\,\frac{1}{c}\frac{\partial H_y}{\partial t}
= -\frac{\partial}{\partial y}\left(\frac{\partial E_y}{\partial x}-\frac{\partial E_x}{\partial y}\right)+\frac{\partial}{\partial z}\left(\frac{\partial E_x}{\partial z}-\frac{\partial E_z}{\partial x}\right)$$

$$= \frac{\partial^2E_x}{\partial y^2}+\frac{\partial^2E_x}{\partial z^2}-\frac{\partial}{\partial x}\left(\frac{\partial E_y}{\partial y}+\frac{\partial E_z}{\partial z}\right) = \frac{1}{c^2}\frac{\partial^2E_x}{\partial t^2},$$

i.e., by (216),

$$\frac{\partial^2E_x}{\partial x^2}+\frac{\partial^2E_x}{\partial y^2}+\frac{\partial^2E_x}{\partial z^2}-\frac{1}{c^2}\frac{\partial^2E_x}{\partial t^2} = 0,$$

which is merely equation (214) with $\psi = E_x$. In the same way we obtain
similar equations for the other five components of the electromagnetic
field. We see, therefore, that d’Alembert’s equation must be regarded
as the result of the elimination, with the help of a second differentiation,
of the different field-components from Maxwell’s equations.

† The reader will easily distinguish between the symbol $H$ in the combinations $H_x$, $H_y$, $H_z$
used here for the components of $\mathbf{H}$ and the simple $H$ used passim for the Hamiltonian energy.

This elimination is usually carried out with the help of the potentials
$A_x$, $A_y$, $A_z$, $\phi$ which are introduced by means of the formulae

$$\mathbf{E} = -\nabla\phi-\frac{1}{c}\frac{\partial\mathbf{A}}{\partial t},\qquad \mathbf{H} = \operatorname{curl}\mathbf{A}.$$

Thereby equations (215 a) and (216 a) turn themselves into identities,
while equations (215) and (216), with the additional condition

$$\frac{\partial A_x}{\partial x}+\frac{\partial A_y}{\partial y}+\frac{\partial A_z}{\partial z}+\frac{1}{c}\frac{\partial\phi}{\partial t} = 0,$$

yield four d’Alembert equations of the type (214) for the components
of the potential.

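The preceding relation leads to a simplification of the wave equation
(196), which assumes the following form:

$$\frac{\partial^2\psi}{\partial x^2}+\frac{\partial^2\psi}{\partial y^2}+\frac{\partial^2\psi}{\partial z^2}-\frac{1}{c^2}\frac{\partial^2\psi}{\partial t^2}
= \frac{4\pi ie}{hc}\left(A_x\frac{\partial\psi}{\partial x}+A_y\frac{\partial\psi}{\partial y}+A_z\frac{\partial\psi}{\partial z}+\frac{\phi}{c}\frac{\partial\psi}{\partial t}\right)
+\frac{4\pi^2}{h^2}\left[\frac{e^2}{c^2}(A^2-\phi^2)+m_0^2c^2\right]\psi, \tag{217}$$

or, in vector notation,

$$\nabla^2\psi-\frac{1}{c^2}\frac{\partial^2\psi}{\partial t^2}
= \frac{4\pi ie}{hc}\left(\mathbf{A}\cdot\operatorname{grad}\psi+\frac{\phi}{c}\frac{\partial\psi}{\partial t}\right)
+\frac{4\pi^2}{h^2}\left[\frac{e^2}{c^2}(A^2-\phi^2)+m_0^2c^2\right]\psi. \tag{217 a}$$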


This equation, written in the form (201 a), can be regarded as the
simplest generalization of d’Alembert’s equation (214) for material
particles (electrons) with a non-vanishing charge $e$ and rest-mass $m_0$,
a generalization obtained by replacing the operators

$$\frac{h}{2\pi i}\frac{\partial}{\partial x},\quad \frac{h}{2\pi i}\frac{\partial}{\partial y},\quad \frac{h}{2\pi i}\frac{\partial}{\partial z},\quad \frac{h}{2\pi i\,c}\frac{\partial}{\partial t}$$

by the operators $u_x = \dfrac{h}{2\pi i}\dfrac{\partial}{\partial x}-\dfrac{e}{c}A_x$, etc., and further by adding to the
left side of (214) the term $m_0^2c^2\psi$.
Now Maxwell’s equations form a system of equations of the first
order satisfying the space-time symmetry condition and implying
d’Alembert’s equation as a corollary. We are thus naturally led to
the conclusion that the first-order equations of the relativistic wave
mechanics, which must replace the second-order equation (201 a), can
be obtained as a generalization of Maxwell’s equations, in a way similar
to that which leads from d’Alembert’s equation (214) to the wave-mechanical
equation (201 a).

We shall assume, therefore, that the electron (or proton) waves can
be described not, as so far assumed, by a scalar quantity $\psi$ but by two
vector quantities $\mathbf{M}$ and $\mathbf{N}$ which are analogous to the magnetic and
electric field strength ($\mathbf{H}$ and $\mathbf{E}$) respectively, and we shall seek to
generalize Maxwell’s equations by introducing the operators $u_x$, etc., instead
of $\dfrac{h}{2\pi i}\dfrac{\partial}{\partial x}$, etc. The second part of this generalization, i.e. the introduction
of the rest-mass, we shall at first disregard, i.e. we shall put $m_0 = 0$.

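To begin with, we must notice that the generalized operators $u$,
unlike the original, are non-commutative, i.e. we obtain different results
if we apply to any function $\psi$ two such operators in a different order.
For example, we have

$$u_xu_y\psi = \left(\frac{h}{2\pi i}\right)^2\frac{\partial^2\psi}{\partial x\,\partial y}-\frac{h}{2\pi i}\frac{\partial}{\partial x}\left(\frac{e}{c}A_y\psi\right)-\frac{e}{c}A_x\,\frac{h}{2\pi i}\frac{\partial\psi}{\partial y}+\frac{e^2}{c^2}A_xA_y\psi,$$

with a corresponding expression for $u_yu_x\psi$. If we form the difference of the expressions $u_xu_y\psi$ and
$u_yu_x\psi$ we obtain

$$(u_xu_y-u_yu_x)\psi = \frac{he}{2\pi ic}\left(\frac{\partial A_x}{\partial y}-\frac{\partial A_y}{\partial x}\right)\psi,$$

or, by (199 b), since $H_z = \partial A_y/\partial x-\partial A_x/\partial y$,

$$u_xu_y-u_yu_x = -\frac{he}{2\pi ic}H_z, \tag{218}$$

if we omit the factor $\psi$ operated upon. In a similar way we get the
formulae

$$u_yu_z-u_zu_y = -\frac{he}{2\pi ic}H_x,\qquad u_zu_x-u_xu_z = -\frac{he}{2\pi ic}H_y,$$

and also

$$u_xu_t-u_tu_x = -\frac{he}{2\pi ic}E_x, \tag{218 a}$$

and two analogous formulae for the combinations $(y,t)$ and $(z,t)$.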
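The non-commutativity of the operators $u$ can be checked numerically. The sketch below is an illustration under stated assumptions, not part of the original derivation: it works in units with $h/2\pi = 1$ and $e/c = 1$, takes $u_x = \frac{h}{2\pi i}\frac{\partial}{\partial x}-\frac{e}{c}A_x$ with the vector potential of a homogeneous field $H_z = B$, uses an arbitrary test function, and replaces the derivatives by central differences.

```python
import cmath

# Assumed units and test data (not from the text): h/(2 pi) = 1, e/c = 1.
HBAR = 1.0
G = 1.0          # e / c
B = 0.8          # homogeneous field H_z
STEP = 1e-3      # finite-difference step


def A_x(x, y):
    return -0.5 * B * y


def A_y(x, y):
    return 0.5 * B * x


def psi(x, y):
    """Arbitrary smooth complex test function."""
    return cmath.exp(-(x * x + y * y) + 1j * (0.7 * x - 0.3 * y))


def u_x(f):
    """Return the function u_x f, with u_x = (hbar/i) d/dx - (e/c) A_x."""
    def result(x, y):
        dfdx = (f(x + STEP, y) - f(x - STEP, y)) / (2 * STEP)
        return (HBAR / 1j) * dfdx - G * A_x(x, y) * f(x, y)
    return result


def u_y(f):
    """Return the function u_y f, with u_y = (hbar/i) d/dy - (e/c) A_y."""
    def result(x, y):
        dfdy = (f(x, y + STEP) - f(x, y - STEP)) / (2 * STEP)
        return (HBAR / 1j) * dfdy - G * A_y(x, y) * f(x, y)
    return result


def commutator_xy(f, x, y):
    """(u_x u_y - u_y u_x) f evaluated at the point (x, y)."""
    return u_x(u_y(f))(x, y) - u_y(u_x(f))(x, y)


point = (0.3, -0.2)
lhs = commutator_xy(psi, *point)
rhs = -(HBAR * G / 1j) * B * psi(*point)   # -(h e / 2 pi i c) H_z psi
print(lhs, rhs)
```

The two numbers agree up to the discretization error of the finite differences, which illustrates that the commutator is a mere multiplication by the field component and no longer contains any differentiation.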

Because the operators $u$ are not commutative, their introduction into
the eight Maxwell equations [multiplied by $h/(2\pi i)$] in place of the
operators $\dfrac{h}{2\pi i}\dfrac{\partial}{\partial x}$, etc., necessitates a further modification. We must,
namely, add to the right side of these equations extra terms of the
form $uM_0$ or $uN_0$ where $M_0$ and $N_0$ are two new scalars; otherwise
(i.e. when $M_0 = N_0 = 0$) the eight equations obtained for the six
quantities $M_x$, $M_y$, $M_z$, $N_x$, $N_y$, $N_z$ would be, in general, incompatible
with one another. In fact, if we limited ourselves to a replacement of
the operators $\dfrac{h}{2\pi i}\dfrac{\partial}{\partial x}$, ... by $u_x$, ..., the equations obtained from (216) and
(216 a) would no longer be a corollary of the equations obtained from
(215) and (215 a) and would therefore contradict the latter.

In writing down the generalized ‘Maxwell-like’ equations, the following
circumstances should be noticed:

(1) The extra terms $uM_0$ and $uN_0$ on the right side must represent
the space-time components of two four-dimensional vectors analogous
respectively to the vector of electric current and charge density in the
case of equations (215) and (216), which will be referred to as the
I group of Maxwell’s equations, and to the vector of ‘magnetic current
and charge density’ in the case of the II group, formed by equations
(215 a) and (216 a).

Treating $M_0$ and $N_0$ as scalar quantities, we can define the components
of the first vector by $u_xM_0$, $u_yM_0$, $u_zM_0$, $\pm u_tM_0$, and those of
the second by $u_xN_0$, $u_yN_0$, $u_zN_0$, $\pm u_tN_0$.

(2) The ambiguity of sign ($\pm$) arising in this connexion can be removed
with the help of the fact that the two groups of Maxwell’s equations
can be derived from each other if $\mathbf{E}$ is replaced by $\mathbf{H}$ and $\mathbf{H}$ by $-\mathbf{E}$.

We must therefore require that one of the two groups of the generalized
Maxwell-like equations be obtained from the other by replacing
$N_x$, $N_y$, $N_z$, $N_0$ by $M_x$, $M_y$, $M_z$, $M_0$ and $M_x$, $M_y$, $M_z$, $M_0$ by $-N_x$, $-N_y$,
$-N_z$, $-N_0$. Taking this into consideration, we obtain, as the first step
in our generalization of Maxwell’s equations, leaving the rest-mass out
of account, the following system of equations:

$$\begin{aligned}
u_yM_z-u_zM_y-u_tN_x &= u_xM_0,\\
u_zM_x-u_xM_z-u_tN_y &= u_yM_0,\\
u_xM_y-u_yM_x-u_tN_z &= u_zM_0,
\end{aligned} \tag{219}$$

$$\begin{aligned}
u_yN_z-u_zN_y+u_tM_x &= u_xN_0,\\
u_zN_x-u_xN_z+u_tM_y &= u_yN_0,\\
u_xN_y-u_yN_x+u_tM_z &= u_zN_0,
\end{aligned} \tag{219 a}$$

$$u_xN_x+u_yN_y+u_zN_z = -u_tM_0, \tag{220}$$

$$u_xM_x+u_yM_y+u_zM_z = +u_tN_0. \tag{220 a}$$

From these equations we will now by ‘generalized differentiation’, i.e.
by repeated application of the operators $u$, obtain eight differential
equations of the second order which correspond to d’Alembert’s differential
equation.

If we apply the operators $u_x$, $u_y$, $u_z$ to the equations (219) and the
operator $u_t$ to equation (220), we obtain by addition, using (218) and
(218 a):

$$-\frac{he}{2\pi ic}\,(H_xM_x+H_yM_y+H_zM_z-E_xN_x-E_yN_y-E_zN_z) = (u_x^2+u_y^2+u_z^2-u_t^2)M_0,$$

or, if we put for shortness

$$D_0 = u_x^2+u_y^2+u_z^2-u_t^2 \tag{221}$$

and use the vector notation:

$$D_0M_0 = \frac{he}{2\pi ic}\,(-\mathbf{H}\cdot\mathbf{M}+\mathbf{E}\cdot\mathbf{N}). \tag{221 a}$$

Similarly we get from (219 a) and (220 a) the equation

$$D_0N_0 = \frac{he}{2\pi ic}\,(-\mathbf{H}\cdot\mathbf{N}-\mathbf{E}\cdot\mathbf{M}). \tag{221 b}$$

With $e = 0$ these equations can be satisfied identically if we put
$M_0 = N_0 = 0$. In the general case, however, the scalar functions $M_0$
and $N_0$ must be different from zero.

If we apply the operator $u_t$ to the first equation (219) and interchange
the order of the different operators $u$, we get, taking account of (218 a),

$$\frac{he}{2\pi ic}\,(E_yM_z-E_zM_y-E_xM_0)+u_yu_tM_z-u_zu_tM_y-u_xu_tM_0-u_t^2N_x = 0.$$

Now by (219 a) and (220):

$$\begin{aligned}
u_yu_tM_z &= u_yu_zN_0-u_yu_xN_y+u_y^2N_x,\\
-u_zu_tM_y &= -u_zu_yN_0+u_z^2N_x-u_zu_xN_z,\\
-u_xu_tM_0 &= u_x^2N_x+u_xu_yN_y+u_xu_zN_z.
\end{aligned}$$

By repeated application of the relations (218) and (218 a), we thus
obtain

$$(u_x^2+u_y^2+u_z^2-u_t^2)N_x+\frac{he}{2\pi ic}\,(E_yM_z-E_zM_y-E_xM_0+H_yN_z-H_zN_y-H_xN_0) = 0.$$

This equation and the two others which result from it by cyclic
interchange of the indices $x$, $y$, $z$ can be summarized in the following
vector equation:

$$D_0\mathbf{N}+\frac{he}{2\pi ic}\,[(\mathbf{E}\times\mathbf{M}-\mathbf{E}M_0)+(\mathbf{H}\times\mathbf{N}-\mathbf{H}N_0)] = 0. \tag{222}$$

Similarly, by application of the same method to equations (219 a), we
obtain the second vector equation,

$$D_0\mathbf{M}+\frac{he}{2\pi ic}\,[(\mathbf{H}\times\mathbf{M}-\mathbf{H}M_0)-(\mathbf{E}\times\mathbf{N}-\mathbf{E}N_0)] = 0. \tag{222 a}$$

Equations (221 a), (221 b), (222), and (222 a) are the required generalization
of d’Alembert’s equation. They differ, however, from the latter,
not only by the differential operators $u$ appearing in $D_0$ instead of
$\dfrac{h}{2\pi i}\dfrac{\partial}{\partial x}$, etc., but also by additional terms which are proportional to the
electromagnetic field components and which for each equation have
a special form.

If we omit these additional terms (whose physical meaning will be
explained later) we obtain, for all the eight functions $M_x$, ..., $N_z$, $M_0$, $N_0$,
identical equations of the d’Alembert type, equations which differ
from the relativity wave equation (201 a) or (217 a) found earlier only
by the absence of the ‘mass term’ $m_0^2c^2$ in the operator $D_0$. This shows
that the second step of our generalization of Maxwell’s equations, in
so far as it is a question of the resulting generalized d’Alembert equations,
must consist in replacing the operator $D_0$ by the operator introduced
earlier, namely,

$$D = D_0+m_0^2c^2. \tag{223}$$

The corresponding introduction of the parameter $m_0c$ into the equations
of the first order (219) to (220 a) is done most simply as follows:
In equations (219) and (220 a), which contain the time derivatives of
the quantities $N_x$, $N_y$, $N_z$, $N_0$, we replace the operator $u_t$ by

$$u_t' = u_t-m_0c, \tag{223 a}$$

and in equations (219 a) and (220) by

$$u_t'' = u_t+m_0c. \tag{223 b}$$

Taking into account the relation $u_t'u_t'' = u_t''u_t' = u_t^2-m_0^2c^2$, we can easily
convince ourselves that from these generalized Maxwell’s equations

$$\begin{aligned}
u_yM_z-u_zM_y-u_t'N_x &= u_xM_0,\\
u_zM_x-u_xM_z-u_t'N_y &= u_yM_0,\\
u_xM_y-u_yM_x-u_t'N_z &= u_zM_0,
\end{aligned} \tag{224}$$

$$\begin{aligned}
u_yN_z-u_zN_y+u_t''M_x &= u_xN_0,\\
u_zN_x-u_xN_z+u_t''M_y &= u_yN_0,\\
u_xN_y-u_yN_x+u_t''M_z &= u_zN_0,
\end{aligned} \tag{224 a}$$

$$u_xN_x+u_yN_y+u_zN_z = -u_t''M_0, \tag{225}$$

$$u_xM_x+u_yM_y+u_zM_z = +u_t'N_0, \tag{225 a}$$

there follow the generalized d’Alembert’s equations:

$$\begin{aligned}
DM_0+\frac{he}{2\pi ic}\,(\mathbf{H}\cdot\mathbf{M}-\mathbf{E}\cdot\mathbf{N}) &= 0,\\
DN_0+\frac{he}{2\pi ic}\,(\mathbf{H}\cdot\mathbf{N}+\mathbf{E}\cdot\mathbf{M}) &= 0,
\end{aligned} \tag{226}$$

$$\begin{aligned}
D\mathbf{M}+\frac{he}{2\pi ic}\,[(\mathbf{H}\times\mathbf{M}-\mathbf{H}M_0)-(\mathbf{E}\times\mathbf{N}-\mathbf{E}N_0)] &= 0,\\
D\mathbf{N}+\frac{he}{2\pi ic}\,[(\mathbf{E}\times\mathbf{M}-\mathbf{E}M_0)+(\mathbf{H}\times\mathbf{N}-\mathbf{H}N_0)] &= 0.
\end{aligned} \tag{226 a}$$

Equations (226) and (226 a) become identical if we put either

$$\mathbf{N} = i\mathbf{M},\qquad N_0 = iM_0 \tag{227}$$

or

$$\mathbf{N} = -i\mathbf{M},\qquad N_0 = -iM_0. \tag{227 a}$$

Thereby they assume the following simple form:

$$DM_0+\frac{he}{2\pi ic}\,(\mathbf{H}\mp i\mathbf{E})\cdot\mathbf{M} = 0, \tag{228}$$

$$D\mathbf{M}+\frac{he}{2\pi ic}\,[(\mathbf{H}\mp i\mathbf{E})\times\mathbf{M}-(\mathbf{H}\mp i\mathbf{E})M_0] = 0. \tag{228 a}$$

Let that solution of these equations which corresponds to the upper
sign be denoted by $M^+$ and the other by $M^-$. The general solution of
equations (226) and (226 a), therefore, can obviously be written in the
form

$$\begin{aligned}
\mathbf{M} &= c_1\mathbf{M}^++c_2\mathbf{M}^-, & M_0 &= c_1M_0^++c_2M_0^-,\\
\mathbf{N} &= i(c_1\mathbf{M}^+-c_2\mathbf{M}^-), & N_0 &= i(c_1M_0^+-c_2M_0^-),
\end{aligned}$$

where $c_1$ and $c_2$ are two arbitrary constants (which must be introduced
if the solutions $M^+$ and $M^-$ are normalized in some way).

It must be mentioned, however, that the first-order equations (224)-(225 a)
do not admit solutions of the type (227) and (227 a), because of
the appearance of the two different operators $u_t'$ and $u_t''$. These solutions
do not have, therefore, any real significance.
28. Alternative Form of the Wave Equations; Duplicity and
Quadruplicity Phenomenon

There is another possibility of halving the number both of the second-order
equations (226)-(226 a) and of the first-order equations (224)-(225 a),
as well as of the wave functions $M$, $N$, defined by them.

We must notice, first, that equations (224)-(225 a) can be naturally
regrouped by associating (225 a) not with (224 a), as has been done
before, but with (224), and (225) with (224 a). The two groups of four
equations thus formed will be denoted by I' and II'' respectively.

It is now easily seen that the equations of each group can be compounded
in pairs and, as it were, folded up together, in such a way as
to form two groups of two equations involving four unknown wave
functions. Taking the group I' we can, for example, compound the
first two equations (224) to form one pair and the third with equation
(225 a) to form the second pair. If we multiply the first equation of
the second pair by $i$ and add it to the other, we get

$$(u_x-iu_y)M_x+(iu_x+u_y)M_y+u_z(M_z-iM_0)+u_t'(-iN_z-N_0)$$
$$= (u_x-iu_y)(M_x+iM_y)+u_z(M_z-iM_0)+u_t'(-iN_z-N_0) = 0.$$

Likewise we obtain, by subtracting the second equation (224) from the
first equation multiplied by $i$,

$$(u_x+iu_y)M_z+u_z(-M_x-iM_y)+u_t'(-iN_x+N_y)-(iu_x-u_y)M_0$$
$$= (u_x+iu_y)(M_z-iM_0)-u_z(M_x+iM_y)+u_t'(-iN_x+N_y) = 0.$$

If we put, therefore,

$$\begin{aligned}
\psi_1 &= M_x+iM_y, & \psi_2 &= M_z-iM_0,\\
\psi_3 &= -i(N_x+iN_y), & \psi_4 &= -i(N_z-iN_0),
\end{aligned} \tag{229}$$

we can reduce the four equations under consideration to the following
two:

$$\begin{aligned}
(u_x-iu_y)\psi_1+u_z\psi_2+u_t'\psi_4 &= 0,\\
(u_x+iu_y)\psi_2-u_z\psi_1+u_t'\psi_3 &= 0.
\end{aligned} \tag{229 a}$$

In a similar way the four equations of the group II'', (224 a) and
(225), can be folded up into the two equations

$$\begin{aligned}
(u_x-iu_y)\psi_3+u_z\psi_4+u_t''\psi_2 &= 0,\\
(u_x+iu_y)\psi_4-u_z\psi_3+u_t''\psi_1 &= 0,
\end{aligned} \tag{229 b}$$

with the same four unknown wave functions (229). The equations
(229 a, b) were first derived by Dirac.
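The algebraic structure behind the folded equations can be made explicit by a small check. The sketch below is an illustration under an assumed reading: the four equations are taken in the form $(u_x-iu_y)\psi_1+u_z\psi_2+u_t'\psi_4=0$, $(u_x+iu_y)\psi_2-u_z\psi_1+u_t'\psi_3=0$, $(u_x-iu_y)\psi_3+u_z\psi_4+u_t''\psi_2=0$, $(u_x+iu_y)\psi_4-u_z\psi_3+u_t''\psi_1=0$, with $u_t'=u_t-m_0c$ and $u_t''=u_t+m_0c$, and are rewritten as one matrix equation $(u_t+\alpha_xu_x+\alpha_yu_y+\alpha_zu_z+\beta m_0c)\psi=0$. The names `ALPHA_X`, `ALPHA_Y`, `ALPHA_Z`, `BETA` are introduced here for illustration only.

```python
# Coefficient matrices read off from the four folded equations (assumed reading,
# see the lead-in). Row k is the equation that contains u_t acting on psi_k.
ALPHA_X = [[0, 0, 0, 1], [0, 0, 1, 0], [0, 1, 0, 0], [1, 0, 0, 0]]
ALPHA_Y = [[0, 0, 0, 1j], [0, 0, -1j, 0], [0, 1j, 0, 0], [-1j, 0, 0, 0]]
ALPHA_Z = [[0, 0, -1, 0], [0, 0, 0, 1], [-1, 0, 0, 0], [0, 1, 0, 0]]
BETA = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]

IDENTITY = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
ZERO = [[0] * 4 for _ in range(4)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def anticommutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] + ba[i][j] for j in range(4)] for i in range(4)]


def is_hermitian(a):
    return all(a[i][j] == complex(a[j][i]).conjugate()
               for i in range(4) for j in range(4))


MATRICES = {"alpha_x": ALPHA_X, "alpha_y": ALPHA_Y,
            "alpha_z": ALPHA_Z, "beta": BETA}

for name, matrix in MATRICES.items():
    print(name, "squares to unity:", matmul(matrix, matrix) == IDENTITY,
          "hermitian:", is_hermitian(matrix))

names = list(MATRICES)
for i in range(4):
    for j in range(i + 1, 4):
        print(names[i], names[j], "anticommute:",
              anticommutator(MATRICES[names[i]], MATRICES[names[j]]) == ZERO)
```

That the four matrices square to unity and anticommute in pairs is what makes a second application of the operators reproduce, for a free particle, the operator $u_x^2+u_y^2+u_z^2-u_t^2+m_0^2c^2$ on each component separately.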

The process just described can be applied to the second-order equations
which are obtained from (226) and (226 a) by taking their components
along the coordinate axes. We have, for instance, according
to the first equation (226 a),

$$D(M_x+iM_y)+\frac{he}{2\pi ic}\Big\{[(H_yM_z-H_zM_y-H_xM_0)+i(H_zM_x-H_xM_z-H_yM_0)]$$
$$-[(E_yN_z-E_zN_y-E_xN_0)+i(E_zN_x-E_xN_z-E_yN_0)]\Big\} = 0,$$

that is,

$$D(M_x+iM_y)+\frac{he}{2\pi ic}\Big\{[iH_z(M_x+iM_y)-i(H_x+iH_y)(M_z-iM_0)]$$
$$-[iE_z(N_x+iN_y)-i(E_x+iE_y)(N_z-iN_0)]\Big\} = 0;$$

and similarly,

$$D(M_z-iM_0)+\frac{he}{2\pi ic}\Big\{[(H_xM_y-H_yM_x-H_zM_0)-i(H_xM_x+H_yM_y+H_zM_z)]$$
$$-[(E_xN_y-E_yN_x-E_zN_0)-i(E_xN_x+E_yN_y+E_zN_z)]\Big\} = 0,$$

that is,

$$D(M_z-iM_0)+\frac{he}{2\pi ic}\Big\{[-i(H_x-iH_y)(M_x+iM_y)-iH_z(M_z-iM_0)]$$
$$-[-i(E_x-iE_y)(N_x+iN_y)-iE_z(N_z-iN_0)]\Big\} = 0,$$

or, according to (229),

$$\begin{aligned}
D\psi_1+\frac{he}{2\pi c}\Big\{[H_z\psi_1-(H_x+iH_y)\psi_2]-i[E_z\psi_3-(E_x+iE_y)\psi_4]\Big\} &= 0,\\
D\psi_2+\frac{he}{2\pi c}\Big\{-[(H_x-iH_y)\psi_1+H_z\psi_2]+i[(E_x-iE_y)\psi_3+E_z\psi_4]\Big\} &= 0.
\end{aligned} \tag{230}$$

In the same way the four remaining equations (226)-(226 a) are folded
up into

$$\begin{aligned}
D\psi_3+\frac{he}{2\pi c}\Big\{-i[E_z\psi_1-(E_x+iE_y)\psi_2]+[H_z\psi_3-(H_x+iH_y)\psi_4]\Big\} &= 0,\\
D\psi_4+\frac{he}{2\pi c}\Big\{i[(E_x-iE_y)\psi_1+E_z\psi_2]-[(H_x-iH_y)\psi_3+H_z\psi_4]\Big\} &= 0.
\end{aligned} \tag{230 a}$$

They can be derived from (230) if $\psi_1$ and $\psi_2$ are replaced by $\psi_3$ and $\psi_4$,
and the latter by $\psi_1$ and $\psi_2$. Both the equations (230) and (230 a) can
be obtained, of course, directly from equations (229 a, b) in the same
way as the equations (226)-(226 a) are obtained from (224)-(225 a),
i.e. by the application of the operators $u$ to the left side of (229 a, b).
The latter equations were established by Dirac in an externally different
form and by a different method, which will be indicated later and which
does not make use of the formal analogy between wave mechanics and
the electromagnetic theory of light. We shall see that this analogy is
actually not so deep as it seems at first sight, and that the regrouping
of the equations (224)-(225 a), which is necessary for their folding up
into the Dirac equations, is a formal expression of a drastic divergence
between the wave-mechanical functions $M$, $N$ and the electromagnetic
functions $H$, $E$.

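It is interesting to notice that a similar regrouping and folding up
can be carried out with regard to Maxwell’s equations. These ‘disguised’
Maxwell’s equations can be obtained from equations (229 a)
and (229 b) by putting $e = m_0 = 0$, and further by replacing the vectors
$\mathbf{M}$ and $\mathbf{N}$ in the definition (229) of the functions $\psi_k$ by $\mathbf{H}$ and $\mathbf{E}$,
dropping the terms $M_0$, $N_0$.

In fact, it can be directly verified that if we put

$$\begin{aligned}
H_x+iH_y &= \psi_1, & H_z &= \psi_2,\\
-i(E_x+iE_y) &= \psi_3, & -iE_z &= \psi_4,
\end{aligned} \tag{231}$$

we obtain, instead of the eight equations (215)-(216 a), the following
four equations:

$$\begin{aligned}
\left(\frac{\partial}{\partial x}-i\frac{\partial}{\partial y}\right)\psi_1+\frac{\partial\psi_2}{\partial z}+\frac{1}{c}\frac{\partial\psi_4}{\partial t} &= 0, &
\left(\frac{\partial}{\partial x}+i\frac{\partial}{\partial y}\right)\psi_2-\frac{\partial\psi_1}{\partial z}+\frac{1}{c}\frac{\partial\psi_3}{\partial t} &= 0,\\
\left(\frac{\partial}{\partial x}-i\frac{\partial}{\partial y}\right)\psi_3+\frac{\partial\psi_4}{\partial z}+\frac{1}{c}\frac{\partial\psi_2}{\partial t} &= 0, &
\left(\frac{\partial}{\partial x}+i\frac{\partial}{\partial y}\right)\psi_4-\frac{\partial\psi_3}{\partial z}+\frac{1}{c}\frac{\partial\psi_1}{\partial t} &= 0.
\end{aligned}$$

Another well-known possibility of reducing the eight Maxwell equations
to four consists in combining the electric and the magnetic field
strengths to form a complex vector

$$\mathbf{K} = \mathbf{H}\pm i\mathbf{E}.$$

We then obtain, instead of (215)-(216 a), four equations of a similar
type, namely,

$$\operatorname{curl}\mathbf{K}\pm\frac{i}{c}\frac{\partial\mathbf{K}}{\partial t} = 0,\qquad \operatorname{div}\mathbf{K} = 0.$$

This method is not applicable to the generalized Maxwell equations
(224)-(225 a).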

The formulae (230) and (230 a) correspond to the union of the
variables $x$ and $y$ as well as of the corresponding components of various
real vectors to form complex quantities $w = x+iy$, $H_x+iH_y = \psi_1$,
$E_x+iE_y = i\psi_3$, etc. The operators $\partial/\partial x-i\,\partial/\partial y$ and $\partial/\partial x+i\,\partial/\partial y$ can
thereby be regarded (apart from a factor 2) as the differential operators $\partial/\partial w$ and $\partial/\partial w^*$
corresponding to the complex variable $w$ and the complex conjugate
variable $w^* = x-iy$ respectively.

While we can regard the formulae (231) as a decomposition of the
complex functions $\psi_1$, ..., $\psi_4$ into real and imaginary parts, this is not
so in the case of the analogous formulae (229). The fact that all the
eight quantities $M$, $N$ must in general assume complex values follows
immediately from the complex nature of the operators $u$ in the equations
(224)-(225 a) determining them.

The reduction of these eight equations to the four equations (229 a, b)
is, therefore, an actual halving of the number of unknowns, while in the
case of the Maxwell equations we have simply a union of real quantities
(as the components of the electromagnetic field are) to form complex
quantities.

If the four complex quantities $\psi_1,\dots,\psi_4$ actually suffice for the
complete determination of the electron waves it must be possible by means
of these functions to express the statistical quantities, i.e. the
probability density $\rho$ and the components of the probability current density
$j_x$, $j_y$, $j_z$ which we have determined earlier by means of the scalar $\psi$.
In the new determination of these quantities we shall at first be guided
by the same analogy as that which led us to the generalized Maxwell
equations, or to the Dirac equations equivalent to them. From this
point of view the quantities $\rho$ and j must correspond respectively to
the electromagnetic energy density

$$\rho = \frac{1}{8\pi}\left(E^2 + H^2\right)$$

and the energy-current density (i.e. to Poynting’s vector)

$$\mathbf{J} = \frac{c}{4\pi}\,\mathbf{E}\times\mathbf{H}.$$

If we put here, instead of the components of E and H, their expressions
obtained from (231):

H x = Kfc+tf), H y = !(*,-#), = 

E x = | (*,-*?), E v = *(&+«), E, = uji i = -pR, 


we obtain 
and further,

J x *= ±-(EJL-E z H u ) = 

= -£~ (03 0?+0? */'•>+02'0i+04 0* )> 


and similar formulae for J y and J z . 

These quadratic expressions are clearly real and also remain real
when all the four quantities $\psi_1,\dots,\psi_4$ are complex. We are led,
therefore, to use them for the representation of the quantities $\rho$ and j.
Omitting the common factor $1/8\pi$, we obtain

$$\rho = \psi_1\psi_1^* + \psi_2\psi_2^* + \psi_3\psi_3^* + \psi_4\psi_4^* \tag{232}$$

jx ="■ <■ (<Al 0f -I- 04 01* + 02 0J +03 0? ) ~) 

-ic(0 1 0 4 *-040f+0 2 0r-030?) • (23-’ ft) 

is— c (0i 0*+03 0* + 02 0*+ 04 0?) 1 

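If these expressions are correct they must, like the expressions obtained
earlier for $\rho$ and j, satisfy the equation

$$\frac{\partial\rho}{\partial t} + \frac{\partial j_x}{\partial x} + \frac{\partial j_y}{\partial y} + \frac{\partial j_z}{\partial z} = 0, \tag{232 b}$$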


expressing the law of the conservation of probability (or of the number 
of copies). It can easily be shown by means of equations (229 a, b) that 
this is indeed the case. 

Multiplying these equations successively by $\psi_4^*$, $\psi_3^*$, $\psi_2^*$, $\psi_1^*$,
subtracting from them the corresponding conjugate equations

$$(u_x^* + iu_y^*)\psi_1^* + u_z^*\psi_2^* + u_t'^{*}\psi_4^* = 0,$$

etc., multiplied by $\psi_4$, etc., and finally adding the results, we get:

+[0?(« J -*“ w )0s-0s«-*O0?]4 [0f(«x+*«»¥*-0«(«J+»O0f]+ 

+ (0X02—02 w ?0?)~ (0 s«*0j—0i w ? 0*) + (0?«204—04 w *0?)“ 

-(0N,03-0 3 <0?)+(0>;*04-04«70r)+(0>;03 - 03«'r0?)+ 

+ ( 0 ?«/' 02 — 0 * M i’* 0 ?)+( 0 f w « 0 i— 01 M <’* 0 *) = 

which, by the definition of the operators u, easily reduces to (232 b) with
the expressions (232) and (232 a) for $\rho$ and $j_x$, $j_y$, $j_z$.

Formula (232) is the immediate generalization of the formula $\rho = \psi\psi^*$
of the original non-relativistic Schrödinger theory. On the other hand,
the expressions (232 a) have a form entirely different from the original
expressions for the current density

$$j_x = \frac{h}{4\pi m_0 i}\left(\psi^*\frac{\partial\psi}{\partial x} - \psi\frac{\partial\psi^*}{\partial x}\right).$$


A more accurate investigation of equations (229a, b) shows, however,
that this difference is not so great as it seems. With harmonically
vibrating waves, corresponding to a motion with a definite energy $\epsilon$,
the dependence of the functions $\psi_1,\dots,\psi_4$ on the time is described by the
common factor $e^{-i2\pi\epsilon t/h}$, so that the operators $u_t'$ and $u_t''$ reduce to the
ordinary factors

$$u_t' = -\frac{1}{c}\left(\epsilon - U + m_0c^2\right) = -\frac{1}{c}\left(W - U + 2m_0c^2\right),$$

$$u_t'' = -\frac{1}{c}\left(\epsilon - U - m_0c^2\right) = -\frac{1}{c}\left(W - U\right), \tag{233}$$


where $U = e\phi$ is the potential energy of the electron and $W-U$ is its
kinetic energy. In general (so far as we restrict ourselves to positive
values of $\epsilon$, see below), the first factor is enormously large compared
with the second; therefore the functions $\psi_3$ and $\psi_4$ which are multiplied
by it in equations (229a) must, with regard to their absolute magnitude,
be very small compared with the functions $\psi_1$ and $\psi_2$. If, moreover,
we restrict ourselves to the case of motion with a kinetic energy
$W-U$, which is small compared with the rest-energy $m_0c^2$, i.e. with
a velocity v whose square is small compared with $c^2$, we can put
approximately, according to (229 a),

$$2m_0c\,\psi_3 = (u_x + iu_y)\psi_2 - u_z\psi_1,$$

$$2m_0c\,\psi_4 = (u_x - iu_y)\psi_1 + u_z\psi_2. \tag{233 a}$$

Since these relations no longer contain the energy $\epsilon$, they may be
regarded as approximately valid in the general case of non-conservative
motion.
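To give a feeling for the magnitudes involved, the following minimal sketch compares the two factors of (233) and estimates the relative size of the small components. The function names, the sample energies and the rounded rest energy of 510 998.95 eV are illustrative choices made here, not taken from the text; the estimate $g/(m_0c)$ assumes the non-relativistic relation $g^2 = 2m_0(W-U)$.

```python
# Electron rest energy m0*c^2 in electron-volts (rounded CODATA value).
REST_ENERGY_EV = 510_998.95


def factor_ratio(kinetic_ev: float) -> float:
    """Ratio |u_t'| / |u_t''| = (T + 2*m0*c^2) / T, with T = W - U."""
    if kinetic_ev <= 0:
        raise ValueError("kinetic energy must be positive")
    return (kinetic_ev + 2.0 * REST_ENERGY_EV) / kinetic_ev


def small_component_estimate(kinetic_ev: float) -> float:
    """Rough size g/(m0*c) of psi_3, psi_4 relative to psi_1, psi_2.

    Uses the non-relativistic relation g^2 = 2*m0*T, so that
    g/(m0*c) = sqrt(2*T / (m0*c^2)).
    """
    return (2.0 * kinetic_ev / REST_ENERGY_EV) ** 0.5


if __name__ == "__main__":
    # Print both quantities for a few sample kinetic energies (in eV).
    for t_ev in (1.0, 13.6, 1000.0):
        print(t_ev, factor_ratio(t_ev), small_component_estimate(t_ev))
```

For kinetic energies of atomic order (a few electron-volts) the first factor exceeds the second by roughly five orders of magnitude, while the small components are of the order of one per cent of the large ones.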

It should be mentioned that, according to (233 a), the ratio of the
functions $\psi_3$, $\psi_4$ to the functions $\psi_1$, $\psi_2$ is of the order of magnitude
$g/(m_0c) \approx v/c$, where v is the velocity of the electron, and g is its proper
momentum estimated roughly by the ratio $u\psi_{1,2}/\psi_{1,2}$. It follows from
this that, to the first approximation with regard to small quantities of
the order v/c, we can put, instead of (232),

$$\rho = \psi_1\psi_1^* + \psi_2\psi_2^*, \tag{234}$$

neglecting the squares of $\psi_3$ and $\psi_4$. Substituting the expressions
(233 a) in (232 a), we get further,


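$$j_x = \frac{1}{2m_0}\Big\{\big[\psi_1^*u_x\psi_1 + \psi_2^*u_x\psi_2 + (\psi_1u_x^*\psi_1^* + \psi_2u_x^*\psi_2^*)\big] + i\big[(\psi_2^*u_y\psi_2 - \psi_1^*u_y\psi_1) - (\psi_2u_y^*\psi_2^* - \psi_1u_y^*\psi_1^*)\big] + \big[(\psi_1^*u_z\psi_2 - \psi_2^*u_z\psi_1) + (\psi_1u_z^*\psi_2^* - \psi_2u_z^*\psi_1^*)\big]\Big\}.$$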

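If we put $A_1 = A_2 = A_3 = 0$ (i.e. if we neglect the potential momentum,
if any, compared with the proper momentum), we obtain the following
formula:

$$j_x = \frac{h}{4\pi m_0 i}\left[\psi_1^*\frac{\partial\psi_1}{\partial x} + \psi_2^*\frac{\partial\psi_2}{\partial x} - \psi_1\frac{\partial\psi_1^*}{\partial x} - \psi_2\frac{\partial\psi_2^*}{\partial x}\right] + \frac{h}{4\pi m_0}\frac{\partial}{\partial y}\left(\psi_2\psi_2^* - \psi_1\psi_1^*\right) + \frac{h}{4\pi m_0 i}\frac{\partial}{\partial z}\left(\psi_1^*\psi_2 - \psi_2^*\psi_1\right),$$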


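the first term of which (in square brackets) is the same generalization
of the original expression

$$\frac{h}{4\pi m_0 i}\left(\psi^*\frac{\partial\psi}{\partial x} - \psi\frac{\partial\psi^*}{\partial x}\right)$$

as (234) is of the original expression for $\rho$. The physical meaning
of the two additional terms will be cleared up in the next section. From
the purely formal point of view, these two terms, as well as the
corresponding terms in $j_y$ and $j_z$, can be regarded as the x-, y-, and
z-components of the curl of a certain vector $c\mathbf{N}$, defined by the formulae

$$N_x = \frac{h}{4\pi m_0c}\left(\psi_1^*\psi_2 + \psi_2^*\psi_1\right), \quad N_y = \frac{ih}{4\pi m_0c}\left(\psi_1^*\psi_2 - \psi_2^*\psi_1\right), \quad N_z = \frac{h}{4\pi m_0c}\left(\psi_2^*\psi_2 - \psi_1^*\psi_1\right), \tag{234 a}$$

so that the approximate expression for the current density in vector
form is:

$$\mathbf{j} = \frac{h}{4\pi m_0 i}\left(\psi_1^*\nabla\psi_1 + \psi_2^*\nabla\psi_2 - \psi_1\nabla\psi_1^* - \psi_2\nabla\psi_2^*\right) + c\operatorname{curl}\mathbf{N}. \tag{234 b}$$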

If, further, we substitute the approximate expressions (233 a) in
equations (229b), the latter assume the following form:

$$(u_x - iu_y)(u_x + iu_y)\psi_2 - (u_x - iu_y)u_z\psi_1 + u_z(u_x - iu_y)\psi_1 + u_z^2\psi_2 + 2m_0c\,u_t''\psi_2 = 0,$$

$$(u_x + iu_y)(u_x - iu_y)\psi_1 + (u_x + iu_y)u_z\psi_2 - u_z(u_x + iu_y)\psi_2 + u_z^2\psi_1 + 2m_0c\,u_t''\psi_1 = 0.$$
Now according to the relations (218) and (218 a), we have

$$(u_x - iu_y)(u_x + iu_y) = u_x^2 + u_y^2 + i(u_xu_y - u_yu_x) = u_x^2 + u_y^2 - \frac{he}{2\pi c}H_z,$$

$$(u_x + iu_y)(u_x - iu_y) = u_x^2 + u_y^2 + \frac{he}{2\pi c}H_z,$$

$$u_z(u_x - iu_y) - (u_x - iu_y)u_z = -\frac{he}{2\pi c}\left(H_x - iH_y\right),$$

$$(u_x + iu_y)u_z - u_z(u_x + iu_y) = -\frac{he}{2\pi c}\left(H_x + iH_y\right).$$

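We have further

$$2m_0c\,u_t'' = 2m_0\left(\frac{h}{2\pi i}\frac{\partial}{\partial t} + e\phi + m_0c^2\right),$$

which reduces to $-2m_0(W - U)$ for conservative motion. We can drop
the constant term $m_0c^2$ if $\frac{h}{2\pi i}\frac{\partial}{\partial t}$ is assumed to reduce in the latter case
to $-W$ and not to $-\epsilon$ (this constant term entails an irrelevant factor
$e^{-i2\pi m_0c^2t/h}$ in the expression of the functions $\psi_1$, $\psi_2$). With this condition,
the preceding equations can be written down in the following form:

$$\left(\frac{1}{2m_0}u^2 + p_t + U\right)\psi_1 + \frac{he}{4\pi m_0c}\left[H_z\psi_1 - (H_x + iH_y)\psi_2\right] = 0,$$

$$\left(\frac{1}{2m_0}u^2 + p_t + U\right)\psi_2 - \frac{he}{4\pi m_0c}\left[(H_x - iH_y)\psi_1 + H_z\psi_2\right] = 0, \tag{235}$$

where $p_t$ stands for $\frac{h}{2\pi i}\frac{\partial}{\partial t}$ and u is the (three-dimensional) vector with
the components $u_x$, $u_y$, $u_z$.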

These equations represent the approximate form of the relativistic
second-order equations (230) and can indeed be obtained from them
by dropping the small quantities $\psi_3$, $\psi_4$. The approximation involved
corresponds to neglecting terms of the second and higher orders in v/c,
including those which represent the variation of mass with velocity.
It must be mentioned that, although the functions $\psi_3$, $\psi_4$ are themselves
small of the first order with regard to $\psi_1$, $\psi_2$, they are multiplied, in
equations (230), by the factor $he/2\pi c$, which can be regarded as a small
quantity of the first order (in 1/c).

If, in equations (235), we drop the additional terms, proportional to
the magnetic intensity (putting either H = 0 or c = ∞), they reduce
to equation (203 a), § 26, the two functions $\psi_1$ and $\psi_2$ becoming
identical with the single function $\psi$ of the previous theory. Equations
(235) thus give a more complete description of the motion than equation
(203 a). In fact they exhibit the duplicity phenomenon which has already
been indicated in Part I, § 19, and traced to the electron’s ‘spin’ or
‘intrinsic magnetic moment’. To these properties correspond additional
forces, which are represented by the additional terms, proportional to
the magnetic field in equation (235), and also to the electric field in the
exact equations (230).

The duplicity phenomenon, as explained in Part I, in its simplest
form consists in the splitting-up of each quantized state, as determined
by Bohr’s theory, into two states which in general have slightly
different energies. So far as the number of states is concerned, Bohr’s
theory gives the same results as the ordinary Schrödinger equation with
one wave function $\psi$. Now to each solution of this equation, $\psi_{H'}$ say,
there corresponds a set of two solutions of the system of equations (235)
or rather of the equations obtained from them, if the operator $-p_t$ is
replaced by the energy constant.

This means that to each energy-value $H'$ of the ordinary Schrödinger
equation there correspond two slightly different energy values, $H'_+$ and
$H'_-$ say, of the system of equations (235). Each of these energy values
is associated with a set of two functions $\psi_{1H'_+}$, $\psi_{2H'_+}$ and $\psi_{1H'_-}$, $\psi_{2H'_-}$;
these four functions replace the single function $\psi_{H'}$ of the Schrödinger
theory.

If, instead of the approximate equations (235), we take the system
of four exact equations (230) and (230a), then by a similar argument
it seems to follow that to each state of the ordinary Schrödinger theory
there correspond, according to the exact theory, four states, whose
energies, if the magnetic and electric field strengths are not too large,
lie close to the energy $H'$ of the single Schrödinger state.

This conclusion is, however, fallacious, for the four second-order
equations (230)-(230a) are not independent of each other, being in fact
derived from the four first-order equations (229a)-(229b). So far as
the number of solutions (i.e. states) is concerned, the latter are
equivalent to two of the four second-order equations derived from them.
We get, therefore, with the exact equations (230)-(230a), a duplicity
phenomenon of the same type as with the approximate equations (235),
the value of the energy being, of course, somewhat different in the
exact theory from what it is in the approximate theory.

The exact theory, when compared with the approximate theory or
with the original non-relativistic Schrödinger theory, leads, however,
to an additional duplicity phenomenon of an entirely different type,
which is not connected with the ‘spin’ property, but can be referred
to as due to the variation of the mass with velocity. This type of
duplicity is already implied in the relativistic equation with the single
function $\psi$, which was derived at the beginning of this chapter. We
come upon it in its simplest form in the case of free motion, when the
operators $u_x$, $u_y$, $u_z$, $u_t$ can be replaced by ordinary numbers (multipliers)
$g_x$, $g_y$, $g_z$, $\epsilon/c$ representing respectively the components of the momentum
and the energy, including the rest-energy $m_0c^2$, divided by c. Equations
(230) reduce in this case to the same form as equation (106), which is
equivalent to the ordinary relativistic relation between momentum
and energy

$$g^2 - \frac{\epsilon^2}{c^2} + m_0^2c^2 = 0.$$

Now since this relation contains the square of the energy, it leads to
two numerically equal values of the latter, one positive and the other
negative,

$$\epsilon = \pm c\sqrt{m_0^2c^2 + g^2}.$$

In Einstein's mechanics, the negative value was rejected as having
no physical meaning. It has, however, been explained already in Part I,
§ 19, that this rejection is not justified in wave mechanics, because of
the possibility of a continuous transition from a state of positive to
that of negative energy $\epsilon$ through imaginary values of the velocity or
because of a ‘jump’ produced by some perturbing forces.

In the case of non-relativistic wave mechanics, we have, under the
same conditions (free motion),

$$g^2 - 2m_0W = 0,$$

where W is the ordinary (kinetic) energy, not including the rest-energy
$m_0c^2$. This non-relativistic energy is related to the positive energy $\epsilon$ of
relativity mechanics by the equation

$$W = \epsilon - m_0c^2,$$

whereas the negative energy $\epsilon$ has no counterpart in non-relativity
mechanics. The appearance of the negative energy $\epsilon$ in addition to the
positive energy forms the essence of the duplicity phenomenon of the
second kind. The situation is not substantially changed in the general
case of motion in a conservative field of force, the only difference being
that the positive and negative energies of the corresponding states are
not numerically equal.
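The relation between the two roots and the non-relativistic energy can be illustrated numerically. The following is a minimal sketch for free motion only; the function names and the choice of units (any consistent system, for instance $m_0 = c = 1$) are assumptions made for the illustration.

```python
import math
from typing import Tuple


def relativistic_energies(g: float, m0: float, c: float) -> Tuple[float, float]:
    """Both roots of g^2 - eps^2/c^2 + m0^2*c^2 = 0 for free motion."""
    root = c * math.sqrt(g * g + (m0 * c) ** 2)
    return root, -root


def kinetic_energy(g: float, m0: float, c: float) -> float:
    """W = eps - m0*c^2, formed from the positive root only."""
    eps_plus, _ = relativistic_energies(g, m0, c)
    return eps_plus - m0 * c * c


def nonrelativistic_energy(g: float, m0: float) -> float:
    """W from the non-relativistic relation g^2 - 2*m0*W = 0."""
    return g * g / (2.0 * m0)


if __name__ == "__main__":
    g, m0, c = 0.01, 1.0, 1.0  # slow particle: g/(m0*c) = 0.01
    print(relativistic_energies(g, m0, c))
    print(kinetic_energy(g, m0, c), nonrelativistic_energy(g, m0))
```

For small $g/(m_0c)$ the positive root minus the rest energy agrees with $g^2/2m_0$ up to terms of relative order $(v/c)^2$, while the negative root has no non-relativistic counterpart.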

Combining the two duplicity phenomena, that due to the spin and that
due to the relativistic variation of the mass, we get a quadrupling
phenomenon which can conveniently (though not quite correctly) be
associated with the replacement of the single $\psi$-function of the Schrödinger
theory by the four $\psi$-functions of Dirac’s theory. This association
is not quite correct, for the same quadruplicity phenomenon would
result from Pauli’s theory, based on the use of two functions $\psi_1$ and $\psi_2$,
if, in the approximate equations (235) defining them, the non-relativistic
operator $u^2/2m_0 + p_t + U$ were replaced by the corresponding relativistic
operator of the second order, $D = (u^2 - u_t^2 + m_0^2c^2)/2m_0$. It must be
mentioned, however, that in doing this we should be guilty of
inconsistency, because, having dropped additional terms of the second order
proportional to the electric field strength in deriving the approximate
equations (235) from the exact equations (230), we must also drop
second- and higher-order terms, representing the dependence of mass
upon velocity, in the main operator D.

In the case of free motion (represented by plane waves), there exists
a very simple relation between the four functions $\psi$ referring to the
positive energy and the corresponding negative energy solution of the
Dirac equations (229a)-(229b). Putting

$$\psi_k = a_k\,e^{i2\pi(g_xx + g_yy + g_zz - \epsilon t)/h}, \tag{236}$$

where the $a_k$ are constants (k = 1, 2, 3, 4), we can replace them by the
following algebraic system:

(^“^y)«i+^«2 + ^(^+ m o c2 K 0 

tex+ i 9 ¥ )<*t—9z a i + ^(c+m 0 c 2 )a 3 - 0 
(9x- i O i/ ) a z+Uz a i + l^-^ l o c2 ) a 2 = 0 

(Ox + { 0y K - f/c «3 + C 2 )« 1 = 0 


(236 a) 


(236 b) 


If, in these equations, the energy $\epsilon$ is replaced by $-\epsilon$, then the first
two become identical with the second two and the latter with the
former if simultaneously $a_1$, $a_2$ are replaced by $a_3$, $a_4$ and $a_3$, $a_4$ by
$-a_1$, $-a_2$. This means that, with

$$\psi_1 = \psi_1', \quad \psi_2 = \psi_2', \quad \psi_3 = \psi_3', \quad \psi_4 = \psi_4',$$

corresponding to $\epsilon = \epsilon' > 0$, we have

$$\psi_1 = \psi_3', \quad \psi_2 = \psi_4', \quad \psi_3 = -\psi_1', \quad \psi_4 = -\psi_2'$$

for $\epsilon = -\epsilon'$.

It has been assumed, hitherto, that the functions $\psi_3$, $\psi_4$ were small
(of the first order in v/c) compared with $\psi_1$, $\psi_2$. We now see that this
is only true if we restrict ourselves to positive energy solutions; the
converse is true in the case of negative energy solutions, both for free
motion and for a motion in a conservative field of force.

From the point of view of the old relativity mechanics, the reversal
of the sign of the energy $\epsilon = c^2m_0/\sqrt{1 - v^2/c^2}$ is equivalent to the reversal
of the sign of the rest-mass $m_0$. This is not exactly true, however, in the
wave-mechanical theory. For a reversal of the sign of $m_0$ in equations
(236a)-(236 b) leads to the replacement of $\psi_1$, $\psi_2$ by $\psi_3$, $\psi_4$ and $\psi_3$, $\psi_4$ by
$\psi_1$, $\psi_2$ without reversal of the sign of the latter. The two solutions have
nothing to do with each other, since they refer to particles of different
kinds (particles with negative rest-mass being in reality non-existent),
whereas the two solutions corresponding to $\epsilon = \pm\epsilon'$ refer to the same
particle with a positive rest-mass $m_0$, the two values of the energy being
due to the ambiguity of sign in the radical of the expression
$\epsilon = c^2m_0/\sqrt{1 - v^2/c^2}$.

It is important to notice that the states of negative energy, as
determined by relativity wave mechanics, are not directly observable.
According to Dirac’s theory of the duality of matter and electricity, outlined
in Part I, § 19, nearly all these states are occupied by electrons, the
vacant states (‘holes’) being observed as protons. According to the
revised version of this theory, the holes in question represent not
protons but positive electrons, which have been recently discovered by
Anderson in America (1932) and by Blackett and Occhialini in
England (1933).

29. The Approximate Pauli Theory in the Two-dimensional
Matrix Form; Electron’s Magnetic Moment and Angular
Momentum

The approximate (non-relativistic) equations (235) were initially
obtained by W. Pauli in 1927, not as an approximation to the Dirac
theory, which was published a year later, but as the result of a
semi-empirical attempt to interpret wave-mechanically the duplicity
phenomenon, which a year before had been incorporated by Uhlenbeck and
Goudsmit into the Bohr theory on the assumption that the electron
possesses a spin motion, with an angular momentum equal to half of
the Bohr unit $h/2\pi$ and a magnetic moment equal to Bohr’s magneton
$\mu = eh/(4\pi m_0c)$.
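For orientation, the two quantities just named can be evaluated numerically. The sketch below works in Gaussian (CGS) units, in which the formula $\mu = eh/(4\pi m_0c)$ applies as written; the constant names are chosen here, and the numerical values are modern CODATA figures, not values quoted in the text.

```python
import math

# Physical constants in Gaussian (CGS) units; modern CODATA values.
E_CHARGE = 4.803204712570263e-10   # elementary charge, esu
PLANCK_H = 6.62607015e-27          # Planck constant h, erg s
ELECTRON_MASS = 9.1093837015e-28   # electron rest mass m0, g
LIGHT_SPEED = 2.99792458e10        # speed of light c, cm/s


def bohr_magneton() -> float:
    """mu = e*h / (4*pi*m0*c), in erg per gauss."""
    return E_CHARGE * PLANCK_H / (4.0 * math.pi * ELECTRON_MASS * LIGHT_SPEED)


def spin_angular_momentum() -> float:
    """Half of the Bohr unit h/(2*pi), i.e. h/(4*pi), in erg s."""
    return PLANCK_H / (4.0 * math.pi)


if __name__ == "__main__":
    print("Bohr magneton [erg/G]:", bohr_magneton())
    print("spin angular momentum [erg s]:", spin_angular_momentum())
```

The magneton comes out close to 9.274 × 10⁻²¹ erg per gauss, and the spin angular momentum close to 5.273 × 10⁻²⁸ erg s.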

Pauli’s equations (235) can actually be put in a form corresponding
to this assumption, i.e. giving a wave-mechanical interpretation of the
electron’s ‘spin’, and, indeed, by using a matrix notation, based upon
the representation of the two functions $\psi_1$, $\psi_2$ as the elements of a
one-column matrix

$$\psi = \begin{pmatrix}\psi_1\\ \psi_2\end{pmatrix}, \tag{237}$$

the conjugate complex functions forming the adjoint one-row matrix

$$\psi^\dagger = \{\psi_1^*,\ \psi_2^*\}. \tag{237 a}$$

Under this condition, the two equations (235) can be written in the form

$$P\psi = 0, \tag{238}$$

where P is a square ‘operator-matrix’ of the second rank

$$P = \begin{pmatrix}P_{11} & P_{12}\\ P_{21} & P_{22}\end{pmatrix} \tag{238 a}$$

with suitably defined elements. These elements must be defined in such
a way that the two equations (235) assume the form

$$P_{11}\psi_1 + P_{12}\psi_2 = 0, \qquad P_{21}\psi_1 + P_{22}\psi_2 = 0. \tag{238 b}$$

Hence it follows that

$$P = \left(\frac{1}{2m_0}u^2 + p_t + U\right)\delta - \mu\,\mathbf{H}\cdot\boldsymbol{\sigma}, \tag{239}$$

where

$$\delta = \begin{pmatrix}1 & 0\\ 0 & 1\end{pmatrix} \tag{239 a}$$

is the unit matrix of the second rank and $\boldsymbol{\sigma}$ is a vector matrix with
the following rectangular components:

$$\sigma_x = \begin{pmatrix}0 & 1\\ 1 & 0\end{pmatrix}, \qquad \sigma_y = \begin{pmatrix}0 & i\\ -i & 0\end{pmatrix}, \qquad \sigma_z = \begin{pmatrix}-1 & 0\\ 0 & +1\end{pmatrix}. \tag{239 b}$$

The scalar product $\mathbf{H}\cdot\boldsymbol{\sigma}$ denotes, as usual, the sum $H_x\sigma_x + H_y\sigma_y + H_z\sigma_z$.
This is a matrix with the elements

$$(\mathbf{H}\cdot\boldsymbol{\sigma})_{11} = -H_z, \quad (\mathbf{H}\cdot\boldsymbol{\sigma})_{12} = H_x + iH_y, \quad (\mathbf{H}\cdot\boldsymbol{\sigma})_{21} = H_x - iH_y, \quad (\mathbf{H}\cdot\boldsymbol{\sigma})_{22} = +H_z. \tag{239 c}$$

The matrix $\boldsymbol{\sigma}$ was introduced by Pauli for the wave-mechanical
representation of the electron’s magnetic moment which was supposed to be
due to its spin. This ‘intrinsic’ magnetic moment can be defined as the
operator or matrix

$$\boldsymbol{\mu} = \mu\boldsymbol{\sigma},$$

where $\mu = eh/(4\pi m_0c)$ is the value of the Bohr magneton.

The reason for this is that equation (238) can be written in the
usual form

$$(K + p_t)\psi = 0 \tag{240}$$

if $p_t$ is defined as the matrix-operator

$$p_t = \delta\,\frac{h}{2\pi i}\frac{\partial}{\partial t}, \tag{240 a}$$

and K as the energy matrix-operator

$$K = \left(\frac{1}{2m_0}u^2 + U\right)\delta - \boldsymbol{\mu}\cdot\mathbf{H}, \tag{240 b}$$

the additional term $-\boldsymbol{\mu}\cdot\mathbf{H}$ having exactly the same form as the energy
of an elementary magnet with a moment $\boldsymbol{\mu}$ in the given external
magnetic field H.
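The algebra of the spin term can be checked with a few lines of code. The sketch below takes the components of the spin matrix in the representation implied by the matrix elements of $\mathbf{H}\cdot\boldsymbol{\sigma}$ quoted in the text (which differs in sign conventions from the representation now most common); the function names are illustrative, and the field is treated as a constant numerical vector, which is a simplifying assumption.

```python
import math

# Spin matrices in the representation used in the text:
# (H.sigma)_11 = -H_z, (H.sigma)_12 = H_x + i*H_y,
# (H.sigma)_21 = H_x - i*H_y, (H.sigma)_22 = +H_z.
SIGMA_X = [[0, 1], [1, 0]]
SIGMA_Y = [[0, 1j], [-1j, 0]]
SIGMA_Z = [[-1, 0], [0, 1]]


def h_dot_sigma(hx, hy, hz):
    """Return the 2x2 matrix H_x*sigma_x + H_y*sigma_y + H_z*sigma_z."""
    return [[SIGMA_X[r][c] * hx + SIGMA_Y[r][c] * hy + SIGMA_Z[r][c] * hz
             for c in range(2)] for r in range(2)]


def dagger(m):
    """Adjoint (conjugate transpose) of a 2x2 matrix."""
    return [[complex(m[c][r]).conjugate() for c in range(2)] for r in range(2)]


def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[r][k] * b[k][c] for k in range(2)) for c in range(2)]
            for r in range(2)]


def spin_energies(mu, hx, hy, hz):
    """Characteristic values of -mu * H.sigma for a constant field.

    Because (H.sigma)^2 = |H|^2 times the unit matrix, they are -mu|H| and +mu|H|.
    """
    magnitude = math.sqrt(hx * hx + hy * hy + hz * hz)
    return (-mu * magnitude, mu * magnitude)


if __name__ == "__main__":
    m = h_dot_sigma(1, 2, 3)
    print(m)
    print(dagger(m) == m)   # Hermitian character of H.sigma
    print(matmul(m, m))     # proportional to the unit matrix
    print(spin_energies(2.0, 0.0, 3.0, 4.0))
```

The two characteristic values $\mp\mu|\mathbf{H}|$ correspond to the magnetic moment pointing along or against the field, which is the elementary-magnet picture described above.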

We thus see that the generalization of the Schrödinger theory which
is necessary to account for the spin phenomenon consists in adding to
the energy operator the extra term $-\boldsymbol{\mu}\cdot\mathbf{H}$ and in replacing ordinary
operators by operator-matrices of the second rank, the function $\psi$ being
replaced accordingly by the one-column matrix (237). The old operators
of the Schrödinger theory, such as $\frac{1}{2m_0}u^2 + U$ and $p_t$, are replaced
by their products with the unit matrix of the second rank $\delta$.

In future we shall usually omit the unit matrix, its presence as a
factor being understood whenever we have to deal with an ordinary
operator, like $u^2$ or U, etc., of the old theory. With this convention,
the old theory can be preserved without any change of form whatsoever,
except for the addition of the extra term $-\boldsymbol{\mu}\cdot\mathbf{H}$ to the energy operator
and the corresponding modification of other expressions connected with
the resulting operator K.

Thus, for instance, if the characteristic values of K, which will be
denoted by $K'$, $K''$, etc., as before, are imagined to be multiplied by
the unit matrix $\delta$, we may write, omitting the latter, in the same way
as in the old theory:

$$(K - K')\psi_{K'} = 0, \tag{241}$$

which is actually equivalent to the system of equations

$$(K_{11} - K')\psi_{K'1} + K_{12}\psi_{K'2} = 0, \qquad K_{21}\psi_{K'1} + (K_{22} - K')\psi_{K'2} = 0. \tag{241 a}$$

It should be mentioned that Schrödinger’s theory can be regarded
as a particular (or rather limiting) case of Pauli’s theory, obtained by
putting $\mu = 0$, i.e. by dropping the extra term $-\boldsymbol{\mu}\cdot\mathbf{H}$ in K, but
preserving the matrix form of the resulting operator H, which can be
defined as the product of the ordinary operator $\frac{1}{2m_0}u^2 + U$ and the unit
matrix $\delta$. The two functions $\psi_1$ and $\psi_2$ become identical in this case
except for a constant factor x, which remains arbitrary, and which,
without loss of generality, can be put equal to zero, the function $\psi_2$
thus vanishing and $\psi_1$ reducing to the ordinary Schrödinger function $\psi$.
Before proceeding further, we must consider the equation which is
satisfied by the function-matrix $\psi^\dagger$, adjoint to $\psi$.

The conjugate complex of equation (240) satisfied by $\psi^*$ is

$$(K^* + p_t^*)\psi^* = 0, \tag{242}$$

where

$$\psi^* = \begin{pmatrix}\psi_1^*\\ \psi_2^*\end{pmatrix}$$

is the conjugate complex of $\psi$. We shall not, however, in future need
this matrix, but the transposed matrix $\psi^\dagger = \{\psi_1^*,\ \psi_2^*\}$. If the matrix
elements of K and $p_t$ were ordinary numbers (and not operators), we
could, instead of the preceding equation, write

$$\psi^\dagger(K^\dagger + p_t^\dagger) = 0. \tag{243}$$

We shall preserve this equation in the general case, with the convention
that the operators $K^\dagger$ and $p_t^\dagger$, contrary to the rule assumed hitherto,
act not on their right but on their left. The same refers to matrix
operators of any type. Thus, if

$$F = \begin{pmatrix}F_{11} & F_{12}\\ F_{21} & F_{22}\end{pmatrix}$$

is a matrix operator acting on $\psi$ and $F\psi$ the one-column matrix

$$F\psi = \begin{pmatrix}F_{11}\psi_1 + F_{12}\psi_2\\ F_{21}\psi_1 + F_{22}\psi_2\end{pmatrix}$$

resulting therefrom, then the adjoint matrix $(F\psi)^\dagger$ will be defined by

$$(F\psi)^\dagger = \psi^\dagger F^\dagger = \{\psi_1^*F_{11}^\dagger + \psi_2^*F_{21}^\dagger,\ \psi_1^*F_{12}^\dagger + \psi_2^*F_{22}^\dagger\},$$

which is in accordance with the usual definition of adjoint matrices.
The necessity for reversing the direction of the action of an operator
from right to left in a transition from $F$ to $F^\dagger$ is due to the fact that
$\psi^\dagger$, being a one-row matrix, must always stand as the first factor in
a matrix product involving it (while $\psi$, being defined as a one-column
matrix, must always stand in the second place).
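For matrices whose elements are ordinary numbers, the rule for the adjoint of a product can be verified directly. The following sketch is restricted to that numerical case (for differential operators the leftward action described above has to be understood in addition); the helper names and the sample values are illustrative only.

```python
def apply(f, psi):
    """One-column matrix F*psi for a 2x2 numerical matrix F."""
    return [f[0][0] * psi[0] + f[0][1] * psi[1],
            f[1][0] * psi[0] + f[1][1] * psi[1]]


def adjoint_vector(psi):
    """One-row matrix adjoint to the one-column matrix psi."""
    return [complex(z).conjugate() for z in psi]


def adjoint_matrix(f):
    """Adjoint (transposed conjugate) of a 2x2 numerical matrix."""
    return [[complex(f[c][r]).conjugate() for c in range(2)] for r in range(2)]


def row_times_matrix(row, m):
    """Product of a one-row matrix (first factor) with a 2x2 matrix."""
    return [row[0] * m[0][0] + row[1] * m[1][0],
            row[0] * m[0][1] + row[1] * m[1][1]]


if __name__ == "__main__":
    F = [[1 + 1j, 2], [3j, 4 - 1j]]
    psi = [1 - 2j, 0.5j]
    left = adjoint_vector(apply(F, psi))                       # (F psi)-adjoint
    right = row_times_matrix(adjoint_vector(psi), adjoint_matrix(F))
    print(left)
    print(right)
```

Both printed rows coincide, showing that the adjoint of $F\psi$ is obtained by letting $F^\dagger$ stand to the right of $\psi^\dagger$.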

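With this convention, the equation for the matrix-function $\psi_{K'}^\dagger$ can
be written in the form

$$\psi_{K'}^\dagger(K^\dagger - K') = 0$$

or, since $K^\dagger = K$,

$$\psi_{K'}^\dagger(K - K') = 0. \tag{243 a}$$

This is equivalent to the ordinary equations

$$\psi_{K'1}^*(K_{11} - K') + \psi_{K'2}^*K_{21} = (K_{11}^* - K')\psi_{K'1}^* + K_{12}^*\psi_{K'2}^* = 0,$$

$$\psi_{K'1}^*K_{12} + \psi_{K'2}^*(K_{22} - K') = K_{21}^*\psi_{K'1}^* + (K_{22}^* - K')\psi_{K'2}^* = 0,$$

which are the conjugate complex of the equations (241 a) ($K'$ being real).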
The product of the matrices $\psi^\dagger$ and $\psi$ is a matrix consisting of one
row and one column only; it can be treated accordingly as a simple
number. This number

$$\psi^\dagger\psi = \psi_1^*\psi_1 + \psi_2^*\psi_2 \tag{244}$$

can also be regarded as the scalar product of the two-component vectors
$\psi$ and $\psi^*$ (or $\psi^\dagger$). It measures, as we know, the probability-density for
finding the electron at a given point in a state of motion specified by
the matrix or vector $\psi$. If the latter is ‘quadratically integrable’, i.e. if
the integral $\int\psi^\dagger\psi\,dV$ extended over the whole space converges, then
$\psi$ can be normalized by setting this integral equal to 1. This refers, in
particular, to functions $\psi_{K'}$ belonging to a discrete energy spectrum,
in which case we can put

$$\int\psi_{K'}^\dagger\psi_{K'}\,dV = 1. \tag{244 a}$$

It can in addition easily be shown in practically the same way as in
the old theory that functions $\psi$ belonging to different energy values,
$K'$ and $K''$ say, satisfy the orthogonality relation

$$\int\psi_{K''}^\dagger\psi_{K'}\,dV = 0 \quad (K' \neq K''), \tag{244 b}$$

where $\psi_{K''}^\dagger\psi_{K'} = \psi_{K''1}^*\psi_{K'1} + \psi_{K''2}^*\psi_{K'2}$
is the product of the matrices (or vectors) $\psi_{K''}^\dagger$ and $\psi_{K'}$.
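The normalization and orthogonality integrals can be illustrated on a discretized example. The sketch below is not taken from the text: it uses hypothetical one-dimensional ‘box’ states $\sqrt{2}\sin(n\pi x)$ on the interval (0, 1), multiplied by a constant two-component factor, and approximates the volume integral by a midpoint sum; all names are illustrative.

```python
import math


def inner(psi_a, psi_b, dv):
    """Discrete approximation of the integral of psi_a-adjoint times psi_b.

    Each wave function is a list of (psi_1, psi_2) samples; the integrand
    at every point is psi_a1* psi_b1 + psi_a2* psi_b2.
    """
    total = 0j
    for (a1, a2), (b1, b2) in zip(psi_a, psi_b):
        total += a1.conjugate() * b1 + a2.conjugate() * b2
    return total * dv


def box_state(n, spinor, points):
    """Hypothetical two-component state: box eigenfunction times a constant spinor."""
    c1, c2 = spinor
    return [(complex(c1 * math.sqrt(2.0) * math.sin(n * math.pi * x)),
             complex(c2 * math.sqrt(2.0) * math.sin(n * math.pi * x)))
            for x in points]


if __name__ == "__main__":
    n_points = 1000
    dv = 1.0 / n_points
    points = [(k + 0.5) * dv for k in range(n_points)]  # midpoint rule
    spinor = (0.6, 0.8j)                                # |0.6|^2 + |0.8i|^2 = 1
    state_1 = box_state(1, spinor, points)
    state_2 = box_state(2, spinor, points)
    print(abs(inner(state_1, state_1, dv)))  # normalization integral
    print(abs(inner(state_1, state_2, dv)))  # orthogonality integral
```

In this example the orthogonality comes entirely from the spatial factors; two states with the same spatial factor could equally be orthogonal through their two-component factors alone.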

We have in fact, multiplying the equation $(K - K')\psi_{K'} = 0$ (on the
left) by $\psi_{K''}^\dagger$ and the equation $\psi_{K''}^\dagger(K - K'') = 0$ (on the right) by $\psi_{K'}$
and subtracting one from the other,

$$\psi_{K''}^\dagger(K\psi_{K'}) - (\psi_{K''}^\dagger K)\psi_{K'} = (K' - K'')\,\psi_{K''}^\dagger\psi_{K'}. \tag{244 c}$$

The two sides of this equation can be considered as ordinary numbers.
If K were not a differential operator but an ordinary matrix of
Hermitian character, i.e. satisfying the condition $K_{\alpha\beta} = K_{\beta\alpha}^*$ or
$K = K^\dagger$, then the left side of (244 c) would vanish identically. In
reality, the matrix K, as defined by formula (240 b), has two component
parts of the above type, namely, the potential energy $U\delta$ and the
additional magnetic energy $-\boldsymbol{\mu}\cdot\mathbf{H} = -\mu\boldsymbol{\sigma}\cdot\mathbf{H}$. In fact, it can be directly
seen from the expressions (239 b) for the rectangular components of
Pauli’s ‘spin matrix’ $\boldsymbol{\sigma}$ that

$$\boldsymbol{\sigma}^\dagger = \boldsymbol{\sigma}. \tag{245}$$

The left side of equation (244 c) thus reduces to

$$\frac{1}{2m_0}\left[\psi_{K''}^\dagger(u^2\psi_{K'}) - (\psi_{K''}^\dagger u^2)\psi_{K'}\right] = \frac{1}{2m_0}\sum_{\alpha}\left(\psi_{K''\alpha}^*\,u^2\psi_{K'\alpha} - \psi_{K'\alpha}\,u^{*2}\psi_{K''\alpha}^*\right).$$
It should be mentioned that in the case $\psi_{K''} = \psi_{K'}$ we obtain under
the div-sign an approximate expression for the current density j. [Cf.
the derivation of the expressions (201b) in § 25.]

Multiplying equation (244c) by the volume-element dV and
integrating over all space, we thus get

$$(K' - K'')\int\psi_{K''}^\dagger\psi_{K'}\,dV = 0,$$

whence the orthogonality relation (244 b) follows, unless $K' = K''$. The
case of degeneracy, i.e. $\psi_{K'} \neq \psi_{K''}$ when $K' = K''$, can be dealt with in the
new theory in exactly the same way as in the old theory, the Schrödinger
‘scalar’ function $\psi$ being replaced by the Pauli two-component vector
(or matrix) $\psi$.

The present theory in the above form is a combination of the ordinary
operator theory and the matrix theory, as developed in the preceding
chapters on the basis of Schrödinger’s equation. It can be reduced,
however, to the usual matrix form by introducing the matrix-components
of the various (two-dimensional) operators F by means of the
formula

$$F_{K''K'} = \int\psi_{K''}^\dagger F\psi_{K'}\,dV, \tag{246}$$

where

$$\psi_{K''}^\dagger F\psi_{K'} = \sum_{\alpha=1}^{2}\sum_{\beta=1}^{2}\psi_{K''\alpha}^*\,F_{\alpha\beta}\,\psi_{K'\beta} \tag{246 a}$$

is an ordinary number (the ‘scalar product’ of the two-dimensional
vectors $\psi_{K''}^*$ and $F\psi_{K'}$; the latter can be regarded as the product of
the vector $\psi_{K'}$ and the two-dimensional ‘tensor’ F).

Replacing the functions $\psi_{K'}$ by their ‘amplitudes’ $\psi^0_{K'}$ with which
they are connected by the same relation

$$\psi_{K'} = \psi^0_{K'}(x, y, z)\,e^{-i2\pi K't/h},$$

as in the Schrödinger theory, we obtain the matrix-elements of F

$$F^0_{K''K'} = \int\psi^{0\dagger}_{K''}F\psi^0_{K'}\,dV.$$

They are connected with the matrix-components by the usual relations

$$F_{K''K'} = F^0_{K''K'}\,e^{i2\pi(K''-K')t/h}. \tag{246 b}$$

All the theorems which have been established in Chap. III with regard
to the matrix representation of physical quantities ‘from the point of
view’ of the energy K, remain valid if the latter, as well as the operators
representing other physical quantities, are defined as two-dimensional
tensors (or square matrices of the second rank). We have, for instance,
the usual expansion formula

$$F\psi_{K'} = \sum_{K''}F_{K''K'}\,\psi_{K''}, \tag{247}$$

which is a direct consequence of the orthogonality and normalizing
relations for the vector-functions $\psi_{K'}$ and which is equivalent to the
following two component-equations:

$$\sum_{\beta=1}^{2}F_{\alpha\beta}\,\psi_{K'\beta} = \sum_{K''}F_{K''K'}\,\psi_{K''\alpha} \quad (\alpha = 1, 2), \tag{247 a}$$

that is,

$$F_{11}\psi_{K'1} + F_{12}\psi_{K'2} = \sum_{K''}F_{K''K'}\,\psi_{K''1},$$

$$F_{21}\psi_{K'1} + F_{22}\psi_{K'2} = \sum_{K''}F_{K''K'}\,\psi_{K''2}.$$

The transformation theory, i.e. the transformation of the matrices of
various physical quantities from the point of view of K (original energy
matrix) to the point of view of some other quantity L, as developed
in Chap. IV on the basis of Schrödinger’s ‘one-dimensional’ theory, can
be applied without any formal modification to Pauli’s two-dimensional
theory. Introducing the transformation coefficients $a_{K''L'}$, we have, for
example, the usual equation

$$\psi_{L'} = \sum_{K''}a_{K''L'}\,\psi_{K''}, \tag{248}$$

which is equivalent to the two equations

$$\psi_{L'\alpha} = \sum_{K''}a_{K''L'}\,\psi_{K''\alpha} \quad (\alpha = 1, 2). \tag{248 a}$$

To make the result expressed by these transformation equations
unambiguous, we must affix to the functions $\psi$ the index x (short for x, y, z,
i.e. the rectangular coordinates of the point to which these functions
refer). We thus get

$$\psi_{L';\,\alpha x} = \sum_{K''}a_{K''L'}\,\psi_{K'';\,\alpha x}. \tag{248 b}$$

This equation clearly shows that the index $\alpha$ (which is supposed to
assume the two values 1 and 2) plays exactly the same role as the space
coordinates x, y, z. It can be considered accordingly as an additional
‘fourth’ coordinate, which is usually referred to as the ‘spin coordinate’.
With this condition, the two functions $\psi_1(x, y, z)$ and $\psi_2(x, y, z)$, forming
the components of the Pauli vector (or matrix) $\psi$, can be considered
as the two values of the same function $\psi(\alpha, x, y, z)$ referring to the same
values of x, y, z and to the two different values $\alpha = 1$ and $\alpha = 2$ of the
spin coordinate. The addition of the latter to the usual three
coordinates x, y, z enables one to reduce the two-dimensional Pauli theory
to the old uni-dimensional form, with one modification only concerning
the operators F as defined ‘from the point of view’ of the basic
quantities $\alpha$, x, y, z. These operators can be defined as ordinary functions
of the continuously variable quantities x, y, z and of the elementary
differential operators $p_x = \frac{h}{2\pi i}\frac{\partial}{\partial x}$, $p_y = \frac{h}{2\pi i}\frac{\partial}{\partial y}$, $p_z = \frac{h}{2\pi i}\frac{\partial}{\partial z}$; they must,
however, be defined as matrices with regard to the discrete variable $\alpha$.
In fact, the result of the application of an operator F to a function
of the type $\psi(\alpha, x, y, z)$ must be another function of the same type
$\phi(\beta, x, y, z)$, referring to the same values of x, y, z but not necessarily to
the same value of $\alpha$. Assuming $\beta$ to be independent of $\alpha$, we see that the
most general type of linear operator satisfying the condition

$$F\psi(\alpha, x) = \phi(\beta, x)$$

can be defined by putting

$$F\psi(\alpha, x) = \sum_{\alpha=1}^{2}F_{\beta\alpha}\,\psi_\alpha,$$

where the $F_{\beta\alpha}$ are ordinary operators involving the space coordinates
only.
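The rule just stated can be sketched in code by treating the wave function as a function of the discrete spin coordinate and of one space coordinate. Everything in the example is an illustrative assumption: the names, the restriction to a single space coordinate, and the sample operator (multiplication by x combined with an exchange of the two spin values).

```python
from typing import Callable, Dict, Tuple

SpaceFunction = Callable[[float], complex]
Wave = Callable[[int, float], complex]
Operator = Callable[[SpaceFunction], SpaceFunction]


def apply_spin_operator(f: Dict[Tuple[int, int], Operator], psi: Wave) -> Wave:
    """Return phi with phi(beta, x) = sum over alpha of F[beta, alpha] psi(alpha, x).

    Each F[(beta, alpha)] is an ordinary operator acting on functions of the
    space coordinate only.
    """
    def phi(beta: int, x: float) -> complex:
        total = 0j
        for alpha in (1, 2):
            component = lambda y, a=alpha: psi(a, y)   # psi(alpha, .) as a space function
            total += f[(beta, alpha)](component)(x)
        return total
    return phi


def zero(g: SpaceFunction) -> SpaceFunction:
    """Operator that annihilates every function."""
    return lambda x: 0j


def multiply_by_x(g: SpaceFunction) -> SpaceFunction:
    """Operator of multiplication by the coordinate x."""
    return lambda x: x * g(x)


if __name__ == "__main__":
    # Sample operator: multiplication by x, exchanging the two spin values.
    F = {(1, 1): zero, (1, 2): multiply_by_x,
         (2, 1): multiply_by_x, (2, 2): zero}
    psi = lambda alpha, x: complex(x) if alpha == 1 else 2j
    phi = apply_spin_operator(F, psi)
    print(phi(1, 3.0), phi(2, 3.0))
```

The resulting function refers to the same value of x as the original one but to a different value of the spin coordinate, which is the behaviour required of the most general linear operator above.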

It is possible and sometimes convenient to modify the preceding
notation in the opposite way, namely, by preserving $\alpha$ as a duplicity
index and introducing similar indices for the two values of all the other
quantities which are derived from a single value through the action of
the spin term $-\mu\boldsymbol{\sigma}\cdot\mathbf{H}$ in the energy operator K. This refers in the first
place to the characteristic values of the energy itself. The two values
of $K'$, which are obtained by the splitting up of a certain characteristic
value $H'$ of the Schrödinger energy operator and which, in general, lie
very close to each other, could be denoted by adding to one of them
a subsidiary index, $\kappa$ say, assuming the two values 1 and 2, the
combination (1, $K'$) being equivalent to $K'_+$, say, and (2, $K'$) being equivalent
to $K'_-$, where $K'_\pm$ are the two values of $K'$ corresponding to the given
value of $H'$. With this notation, the transformation equation (248 b) can
be rewritten in the form

$$\psi_{\kappa'L';\,\alpha'x'} = \sum_{K''}\sum_{\kappa''=1}^{2}a_{\kappa''K'';\,\kappa'L'}\,\psi_{\kappa''K'';\,\alpha'x'},$$

where $K''$ and $L'$ are the single values of the energy operators K or L
unperturbed by the spin term $-\mu\boldsymbol{\sigma}\cdot\mathbf{H}$.

From this point of view, the matrix components of an operator F:

$$F_{\kappa''K'';\,\kappa'K'} = \int\psi_{\kappa''K''}^\dagger F\psi_{\kappa'K'}\,dV, \tag{249}$$

can be grouped together into two-dimensional matrices

$$F_{K''K'} = \begin{pmatrix}F_{1K'';\,1K'} & F_{1K'';\,2K'}\\ F_{2K'';\,1K'} & F_{2K'';\,2K'}\end{pmatrix}, \tag{249 a}$$

which correspond to the ordinary components of the matrix F, defined
from the point of view of the Schrödinger energy operator K without
the spin term.

The matrix $F_{K''K'}$ considered in this way, i.e. as formed by elements
which are themselves matrices, is called a ‘super-matrix’.

We shall not consider the further development of these formal considerations.
The preceding outline will be sufficient for handling various
problems connected with Pauli’s theory in any one of the three equivalent
forms, which have just been indicated. The simplest and most
important of these problems is the approximate solution of Pauli’s
equation, considering the spin term $-\mu\,\boldsymbol{\sigma}\cdot\mathbf{H}$ as a small perturbation.
The energy operator resulting from $K$ by the omission of this term will
be denoted by $H$; it is equal to the Schrödinger operator $u^2/(2m_0) + U$
multiplied by the two-dimensional unit matrix $\delta$. In order to avoid
confusion between this operator and the magnetic field strength, we
shall denote the latter by $\mathfrak{H}$.

The change of the energy values $H'$ produced by this perturbation
can be calculated, to the first approximation, by means of the same
equations as in the case of the Schrödinger perturbation theory. In
doing this we must, however, keep in mind the fact that the unperturbed
problem is degenerate, each value of $H'$ corresponding to at least
two different states. It is just this latent duplicity which must be
revealed by taking into account the spin energy

$$S = -\mu\,\boldsymbol{\sigma}\cdot\mathfrak{H}. \tag{250}$$

Assuming no other degeneracy to take place (or the matrix elements
of the perturbation energy $S$ with regard to other states of equal
unperturbed energy to vanish), we obtain the following equation for
the first-order correction $\Delta H'$ of the unperturbed energy

$$\begin{vmatrix} S^{1,1}-\Delta H' & S^{1,2} \\ S^{2,1} & S^{2,2}-\Delta H' \end{vmatrix} = 0, \tag{250 a}$$

where

$$S^{\kappa,\lambda} = S_{\kappa H',\lambda H'} = \int \psi^{\dagger}_{\kappa H'}\,S\,\psi_{\lambda H'}\,dV, \tag{250 b}$$

the indices $\kappa$, $\lambda$ (= 1, 2) specifying the two degenerate states in question.
They are used as superscripts in the matrix elements of $S$ in
order to distinguish the latter from the matrix elements with regard
to the spin-index

$$S_{\alpha\beta} = -\mu\,(\mathfrak{H}_x\sigma_{x\alpha\beta} + \mathfrak{H}_y\sigma_{y\alpha\beta} + \mathfrak{H}_z\sigma_{z\alpha\beta}).$$

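The two functions $\psi_{\kappa H'}$ ($\kappa = 1, 2$), or rather function-pairs $\psi_{\kappa H';\,\alpha x}$
($\alpha = 1, 2$), describing these degenerate states must be defined with the
help of the ordinary Schrödinger function $\psi_{H'x} = \psi$ in such a way as to
satisfy the orthogonality and normalizing relations. The simplest way
to do this is to put

$$\psi_{1H';\,1x} = \psi_{H'x}, \qquad \psi_{1H';\,2x} = 0,$$
$$\psi_{2H';\,1x} = 0, \qquad \psi_{2H';\,2x} = \psi_{H'x}$$

(supposing the function $\psi_{H'x}$ to be normalized).

By the definition of the spin matrix $\boldsymbol{\sigma}$ [cf. equations (239 b)] we have,
dropping the indices $H'$ and $x$,

$$(S\psi)_1 = S_{11}\psi_1 + S_{12}\psi_2 = \mu[\mathfrak{H}_z\psi_1 - (\mathfrak{H}_x + i\mathfrak{H}_y)\psi_2],$$
$$(S\psi)_2 = S_{21}\psi_1 + S_{22}\psi_2 = \mu[(-\mathfrak{H}_x + i\mathfrak{H}_y)\psi_1 - \mathfrak{H}_z\psi_2].$$

In the present case these expressions reduce to

$$(S\psi_1)_1 = \mu\mathfrak{H}_z\psi, \qquad (S\psi_1)_2 = \mu(-\mathfrak{H}_x + i\mathfrak{H}_y)\psi$$

for $\lambda = 1$, and

$$(S\psi_2)_1 = -\mu(\mathfrak{H}_x + i\mathfrak{H}_y)\psi, \qquad (S\psi_2)_2 = -\mu\mathfrak{H}_z\psi$$

for $\lambda = 2$. We thus get, with the help of (250 b) or $S^{\kappa,\lambda} = \int \psi^{\dagger}_{\kappa} S\psi_{\lambda}\,dV$:

$$\begin{aligned}
S^{1,1} &= \mu\int \mathfrak{H}_z\,\psi^*\psi\,dV,\\
S^{1,2} &= -\mu\int (\mathfrak{H}_x + i\mathfrak{H}_y)\,\psi^*\psi\,dV,\\
S^{2,1} &= -\mu\int (\mathfrak{H}_x - i\mathfrak{H}_y)\,\psi^*\psi\,dV,\\
S^{2,2} &= -\mu\int \mathfrak{H}_z\,\psi^*\psi\,dV,
\end{aligned} \tag{251 a}$$

whence, according to (250 a),

$$(\Delta H')^2 = (S^{1,1})^2 + |S^{1,2}|^2,$$

since $S^{2,2} = -S^{1,1}$ and $S^{2,1} = S^{1,2*}$, or

$$\Delta H' = \pm\sqrt{(S^{1,1})^2 + |S^{1,2}|^2}. \tag{251 b}$$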

This formula solves our problem so far as the splitting of the original
‘unperturbed’ energy-level is concerned. The fact that the two sub-levels
have an additional energy of the same magnitude and of opposite
sign can be interpreted by assuming that the intrinsic magnetic moment
of the electron has in both cases opposite orientations varying, in
general, from one place to another according to the direction of the
magnetic field. In the simplest case of a homogeneous field, the two
orientations can be shown to be parallel to the latter.

We have, in fact, in this case

$$S^{1,1} = \mu\mathfrak{H}_z\int \psi^*\psi\,dV = \mu\mathfrak{H}_z, \qquad S^{1,2} = -\mu(\mathfrak{H}_x + i\mathfrak{H}_y),$$

so that

$$\Delta H' = \pm\mu\mathfrak{H}, \tag{251 c}$$

where $\mathfrak{H} = \sqrt{\mathfrak{H}_x^2 + \mathfrak{H}_y^2 + \mathfrak{H}_z^2}$ is the magnitude of the magnetic field
strength. This formula is in full agreement with the assumption that
the electron has an intrinsic magnetic moment of magnitude $\mu$ (Bohr’s
magneton), which in a homogeneous magnetic field is oriented either
in the same or in the opposite direction to the magnetic lines of force.
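The splitting formula can be checked numerically. The following is only a minimal sketch, assuming a homogeneous field, a normalized $\psi$, and units in which $\mu$ and the field components are plain numbers; the function names are illustrative and do not come from the text.

```python
import math

def spin_energy_matrix(mu, hx, hy, hz):
    """Matrix S^{kappa,lambda} of (251 a) for a uniform field and a normalized psi."""
    return (
        (mu * hz, -mu * (hx + 1j * hy)),
        (-mu * (hx - 1j * hy), -mu * hz),
    )

def level_shifts(s):
    """Both roots of the secular equation (250 a), using formula (251 b)."""
    root = math.sqrt(abs(s[0][0]) ** 2 + abs(s[0][1]) ** 2)
    return (root, -root)

def secular_determinant(s, shift):
    """Left-hand side of (250 a) for a trial value of the energy shift."""
    return (s[0][0] - shift) * (s[1][1] - shift) - s[0][1] * s[1][0]

# Illustrative values: mu = 1 and a field (1, 2, 2) of magnitude 3.
S = spin_energy_matrix(mu=1.0, hx=1.0, hy=2.0, hz=2.0)
SHIFTS = level_shifts(S)
```

With these assumed values the two shifts come out as plus and minus $\mu$ times the field magnitude, and the determinant (250 a) vanishes at both of them up to rounding.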

It can in addition easily be shown that, in the case under consideration,
formula (251 c), which has been derived as the first approximation,
holds exactly.

For the sake of simplicity, we shall imagine the magnetic field to be
parallel to the $z$-axis. Pauli’s equation then reduces to the form

$$(H - \mu\mathfrak{H}\sigma_z - K')\psi = 0,$$

which is equivalent to the two equations (cf. (235)):

$$(H + \mu\mathfrak{H} - K')\psi_1 = 0,$$
$$(H - \mu\mathfrak{H} - K')\psi_2 = 0.$$

If $\psi_{H'}$ is the solution of the Schrödinger equation $(H - H')\psi = 0$
corresponding to the unsplit energy-level $H'$, then the solution of the
preceding system can be put in the form

(1) $K' = H' + \mu\mathfrak{H}$, $\psi_1 = \psi_{H'}$, $\psi_2 = 0$,

(2) $K' = H' - \mu\mathfrak{H}$, $\psi_1 = 0$, $\psi_2 = \psi_{H'}$.

The first case obviously corresponds to an orientation in the direction
opposite to that of the magnetic field, and the second to an orientation
in a direction coinciding with it (i.e. in the direction of the positive
$z$-axis).

This indicates, incidentally, that the functions $\psi_1$ and $\psi_2$ can be
considered as the probability amplitudes for finding the electron at a
given point with its intrinsic magnetic moment pointing in the negative
and positive directions of the $z$-axis respectively. In the general case,
both of them are different from zero. It is perfectly natural that, under
this condition, the probability of finding the electron at a given point
irrespective of its orientation should be measured by the sum $|\psi_1|^2 + |\psi_2|^2$.
We see, further, that the index $\alpha$ which distinguishes the two components
of the ‘vector’ $\psi$ fully deserves the title of a fourth ‘spin-coordinate’;
it must be borne in mind, however, that it specifies not
the orientation of the ‘spin’ or magnetic axis in space, but only its
orientation in one of the two senses parallel to a given direction, namely,
that of the $z$-axis.

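This interpretation is supported by the form of the expression for
the average or probable value of the $z$-component of the electron’s
magnetic moment, as defined in the usual way by the formula

$$\bar{\mu}_z = \int \psi^{\dagger}\mu_z\psi\,dV.$$

We have, namely, with $\mu_z = \mu\sigma_z$ and

$$(\sigma_z\psi)_1 = \sigma_{z11}\psi_1 + \sigma_{z12}\psi_2 = -\psi_1, \qquad (\sigma_z\psi)_2 = \sigma_{z21}\psi_1 + \sigma_{z22}\psi_2 = +\psi_2,$$

$$\bar{\mu}_z = \mu\int (\psi_2^*\psi_2 - \psi_1^*\psi_1)\,dV. \tag{252}$$

In a similar way we find

$$\bar{\mu}_x = \mu\int (\psi_1^*\psi_2 + \psi_2^*\psi_1)\,dV, \qquad \bar{\mu}_y = i\mu\int (\psi_1^*\psi_2 - \psi_2^*\psi_1)\,dV. \tag{252 a}$$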


We thus see that the direct relation of the functions $\psi_1$ and $\psi_2$ to the
orientation of the electron’s magnetic moment is limited to the $z$-axis.
The two functions $\psi_1^*\psi_2$ and $\psi_2^*\psi_1$ have complex conjugate values, and
cannot be associated with a definite direction of the electron’s moment
parallel to the $x$- or to the $y$-axis.

The quantities

$$\mathfrak{M}_x = \mu(\psi_1^*\psi_2 + \psi_2^*\psi_1), \qquad \mathfrak{M}_y = i\mu(\psi_1^*\psi_2 - \psi_2^*\psi_1), \qquad \mathfrak{M}_z = \mu(\psi_2^*\psi_2 - \psi_1^*\psi_1) \tag{252 b}$$

are the components of a certain vector $\mathfrak{M}$, which can be defined as the
probable magnetization, i.e. the probable value per unit volume of the
magnetic moment of the ‘electron cloud’ distributed with the density
$\psi_1^*\psi_1 + \psi_2^*\psi_2 = \rho$. The vector $\mathfrak{M}/\rho$ can be regarded accordingly as
defining, both with respect to magnitude and direction, the probable value
of the intrinsic magnetic moment of the electron, supposed to be
situated at a given point. The magnitude of $\mathfrak{M}/\rho$ must, of course, be
expected to be equal to $\mu$. This is easily seen to be actually the case.
We have in fact,

$$\mathfrak{M}^2 = \mathfrak{M}_x^2 + \mathfrak{M}_y^2 + \mathfrak{M}_z^2 = (\mathfrak{M}_x + i\mathfrak{M}_y)(\mathfrak{M}_x - i\mathfrak{M}_y) + \mathfrak{M}_z^2$$
$$= \mu^2[4\psi_1^*\psi_1\psi_2^*\psi_2 + (\psi_2^*\psi_2 - \psi_1^*\psi_1)^2] = \mu^2(\psi_1^*\psi_1 + \psi_2^*\psi_2)^2,$$

so that $\mathfrak{M}/\rho = \mu$. The unit vector $\mathfrak{M}/\mu\rho$ thus determines the probable
direction of the electron’s moment at a given point.
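The statement that the magnitude of the probable magnetization equals $\mu\rho$ can be tried on arbitrary numbers. The sketch below assumes the component formulas of the type (252 a) evaluated at a single point, with $\psi_1$, $\psi_2$ given as complex numbers; the function names and sample values are hypothetical.

```python
import math

def magnetization(psi1, psi2, mu=1.0):
    """Probable magnetization at one point for a two-component psi.

    Uses the convention of this section, in which psi2 alone means
    'moment along +z' and psi1 alone means 'moment along -z'.
    """
    a = psi1.conjugate() * psi2      # psi_1^* psi_2
    b = psi2.conjugate() * psi1      # psi_2^* psi_1
    m_x = (mu * (a + b)).real        # a + b is real
    m_y = (1j * mu * (a - b)).real   # i (a - b) is real
    m_z = mu * (abs(psi2) ** 2 - abs(psi1) ** 2)
    return (m_x, m_y, m_z)

def density(psi1, psi2):
    """rho = |psi_1|^2 + |psi_2|^2."""
    return abs(psi1) ** 2 + abs(psi2) ** 2

# Arbitrary sample amplitudes (not taken from the text).
PSI1, PSI2 = 0.6 + 0.2j, -0.3 + 0.7j
M = magnetization(PSI1, PSI2, mu=2.0)
RHO = density(PSI1, PSI2)
M_LENGTH = math.sqrt(sum(component ** 2 for component in M))
```

For any choice of the two amplitudes, `M_LENGTH` should agree with `mu * RHO` to rounding accuracy, which is the content of the relation $\mathfrak{M}/\rho = \mu$.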

The physical meaning of the vector $\mathfrak{M}$ is in agreement with the
expression $c\,\operatorname{curl}\mathfrak{M}$ in formula (234 a) for the additional current density
(cf., for instance, my Lehrbuch der Elektrodynamik, vol. ii, Chap. I).

In contradistinction to the electron’s position, its orientation cannot
be specified exactly, so that we must confine ourselves to the determination
of the probable orientation or of the probability of a certain
orientation (under given circumstances). The formal reason for this
difference is that the matrices $\mu_x$, $\mu_y$, $\mu_z$ or $\sigma_x$, $\sigma_y$, $\sigma_z$, whose characteristic
values should specify the orientation in the same way as the
values of the coordinates $x$, $y$, $z$ specify the position, are not independent
of each other.

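In fact, multiplying them according to the usual rule of matrix
multiplication, we get

$$\begin{aligned}
\sigma_x\sigma_y &= \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}\begin{pmatrix} 0 & i \\ -i & 0 \end{pmatrix} = \begin{pmatrix} -i & 0 \\ 0 & i \end{pmatrix} = i\sigma_z,\\
\sigma_y\sigma_z &= \begin{pmatrix} 0 & i \\ -i & 0 \end{pmatrix}\begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 0 & i \\ i & 0 \end{pmatrix} = i\sigma_x,\\
\sigma_z\sigma_x &= \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix}\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} = i\sigma_y.
\end{aligned} \tag{253}$$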


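If the multiplication is effected in the opposite order, the same results
are obtained but with the opposite sign, so that

$$\sigma_x\sigma_y = -\sigma_y\sigma_x, \qquad \sigma_y\sigma_z = -\sigma_z\sigma_y, \qquad \sigma_z\sigma_x = -\sigma_x\sigma_z. \tag{253 a}$$

These equations express the fact that the matrices $\sigma_x$, $\sigma_y$, $\sigma_z$ do not
commute with each other, in contradistinction to the coordinates $x$, $y$, $z$;
according to Dirac’s terminology they are said to ‘anticommute’. Combining
equations (253) and (253 a), we get

$$\sigma_x\sigma_y - \sigma_y\sigma_x = 2i\sigma_z,$$

etc., or in vector notation

$$\boldsymbol{\sigma}\times\boldsymbol{\sigma} = 2i\boldsymbol{\sigma}. \tag{253 b}$$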
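The multiplication rules can be verified mechanically. The sketch below assumes the explicit matrices implied by the component equations of this section, namely $\sigma_x$ with unit off-diagonal elements, $\sigma_y$ with elements $i$ and $-i$, and $\sigma_z$ with diagonal elements $-1$ and $+1$; this sign convention differs from the one most common in later literature. The helper names are illustrative only.

```python
# Spin matrices in the convention assumed for this section (sigma_z = diag(-1, +1)).
SX = ((0, 1), (1, 0))
SY = ((0, 1j), (-1j, 0))
SZ = ((-1, 0), (0, 1))

def matmul(a, b):
    """Product of two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def scale(c, a):
    """Multiply every element of a 2x2 matrix by the number c."""
    return tuple(tuple(c * x for x in row) for row in a)

def commutator(a, b):
    """Plain matrix commutator ab - ba (no factor 2 pi i / h)."""
    ab, ba = matmul(a, b), matmul(b, a)
    return tuple(tuple(ab[i][j] - ba[i][j] for j in range(2)) for i in range(2))

# Cyclic triples (first, second, third) for which first * second = i * third.
CYCLIC = ((SX, SY, SZ), (SY, SZ, SX), (SZ, SX, SY))

for first, second, third in CYCLIC:
    assert matmul(first, second) == scale(1j, third)                    # (253)
    assert matmul(second, first) == scale(-1, matmul(first, second))    # (253 a)
    assert commutator(first, second) == scale(2j, third)                # (253 b)
```

All entries are 0, ±1 or ±i, so the comparisons are exact and no rounding tolerance is needed.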


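The non-commutability of the matrices $\sigma_x$, $\sigma_y$, $\sigma_z$ means that the
values of the quantities represented by them cannot be determined
(‘observed’ or ‘measured’) simultaneously. It should be mentioned
that these values are to be defined in the usual way, namely, as
the characteristic values of the corresponding matrices, regarded as
linear operators, acting on a two-component function of the type $\psi$.
Denoting these values by dashes, we have for their determination the
equations

$$\sigma_x\psi_x = \sigma'_x\psi_x, \qquad \sigma_y\psi_y = \sigma'_y\psi_y, \qquad \sigma_z\psi_z = \sigma'_z\psi_z,$$

or in components

$$\sigma_{x11}\psi_{x1} + \sigma_{x12}\psi_{x2} = \sigma'_x\psi_{x1}, \qquad \sigma_{x21}\psi_{x1} + \sigma_{x22}\psi_{x2} = \sigma'_x\psi_{x2},$$

etc., that is,

$$\begin{aligned}
\psi_{x2} &= \sigma'_x\psi_{x1}, & \psi_{x1} &= \sigma'_x\psi_{x2},\\
i\psi_{y2} &= \sigma'_y\psi_{y1}, & -i\psi_{y1} &= \sigma'_y\psi_{y2},\\
-\psi_{z1} &= \sigma'_z\psi_{z1}, & \psi_{z2} &= \sigma'_z\psi_{z2},
\end{aligned} \tag{254}$$

whence it follows that

$$\begin{aligned}
\sigma'_x &= \pm 1, & \psi_{x2} &= \pm\psi_{x1},\\
\sigma'_y &= \pm 1, & \psi_{y2} &= \mp i\psi_{y1},\\
\sigma'_z &= \pm 1, & \psi_{z1} &= 0 \ \text{or}\ \psi_{z2} = 0 \ \text{respectively}.
\end{aligned} \tag{254 a}$$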

The characteristic values of the rectangular components of the electron’s
magnetic moment $\boldsymbol{\mu} = \mu\boldsymbol{\sigma}$ are equal accordingly to $\pm\mu$. This
means that, in determining the orientation of this moment with respect
to some axis, we have to assume beforehand that it is parallel to this
axis, the question to be decided reducing to the choice between the
positive and the negative direction. In other words, we have to assume
that the electron’s magnetic moment is quantized about some (arbitrarily
chosen) axis, the two possible values of its projection on this axis being
$+\mu$ and $-\mu$, while its projection on any other axis remains undetermined.
In the preceding theory this role of quantization or reference
axis has been conferred on the $z$-axis. The theory can easily be
generalized for the case when this reference axis has any direction
whatsoever with regard to the coordinate axes.
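The characteristic values and the corresponding two-component functions can be confirmed by direct substitution. This is a small sketch under the same assumed matrix convention ($\sigma_z$ with diagonal elements $-1$, $+1$); the listed functions are unnormalized and the names are illustrative.

```python
SIGMA_X = ((0, 1), (1, 0))
SIGMA_Y = ((0, 1j), (-1j, 0))
SIGMA_Z = ((-1, 0), (0, 1))

def apply(matrix, vector):
    """Action of a 2x2 matrix on a two-component function (psi_1, psi_2)."""
    return tuple(matrix[i][0] * vector[0] + matrix[i][1] * vector[1] for i in range(2))

def is_eigenvector(matrix, vector, value):
    """True if matrix * vector equals value * vector exactly."""
    return apply(matrix, vector) == tuple(value * component for component in vector)

# (matrix, characteristic value, unnormalized two-component function)
SOLUTIONS = (
    (SIGMA_X, +1, (1, 1)),
    (SIGMA_X, -1, (1, -1)),
    (SIGMA_Y, +1, (1, -1j)),
    (SIGMA_Y, -1, (1, 1j)),
    (SIGMA_Z, +1, (0, 1)),
    (SIGMA_Z, -1, (1, 0)),
)
ALL_OK = all(is_eigenvector(m, v, s) for m, s, v in SOLUTIONS)
```

Each matrix has only the two characteristic values $+1$ and $-1$, so each rectangular component of the moment can only be found equal to $+\mu$ or $-\mu$.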

These results appear quite natural from the point of view of the
general transformation theory, developed in Chapter IV. Since the
matrices $\sigma_x$, $\sigma_y$, $\sigma_z$ do not commute with each other, one of them only
can be used as a basic quantity, not only for the determination of the
two others, but also for the determination of the matrix $\sigma_n$ representing
the projection of $\boldsymbol{\sigma}$ on any other direction $\mathbf{n}$. In the preceding theory,
this basic role has been conferred on $\sigma_z$, which appears accordingly as
a diagonal matrix, while $\sigma_x$ and $\sigma_y$ are not diagonal.

The present case can serve as a very simple illustration of the transformation
theory, since we have to do with two states only, the state-space
thus reducing to a plane in which the two states are represented
by two mutually perpendicular axes, $z_+$ and $z_-$ say. Replacing $z$ as
a reference axis (in ordinary space) by some other axis $z'$, we obtain
two other states (in which the electron’s magnetic moment is oriented
parallel to $z'$), which are represented on the ‘state-plane’ by two other
mutually perpendicular axes $z'_+$ and $z'_-$ (with the same origin as the
axes $z_\pm$). If the angle between $z$ and $z'$ is equal to $\theta$, then the angle
between the axes $z_+$ and $z'_+$ in the state-plane must obviously be equal
to $\tfrac12\theta$, since to an angle of 180° between the direction of the positive
and negative $z$ (or $z'$) axis there corresponds an angle of 90° between
the axes $z_+$ and $z_-$ (or $z'_+$ and $z'_-$) on the state diagram. Now, as we
know from the general theory, the square of the cosine of the angle
between two axes in the state-space is equal to the relative probability
of the state represented by one of them subject to the assumption that
the probability of the other is equal to unity. Hence it follows that
if the magnetic moment of the electron is known to be pointing in
a certain direction (that of $+z$, say), there is a probability equal to
$\cos^2\tfrac12\theta$ that it will be found pointing in another direction (that of $+z'$)
making an angle $\theta$ with the former. The probability that it will be
found pointing in the direction opposite to the latter (i.e. that of $-z'$)
is equal to $\cos^2\tfrac12(\pi-\theta) = \sin^2\tfrac12\theta$. We thus see that if the electron’s
moment is known to point in a certain direction ($+z$), there is a probability
equal to $\cos^2\tfrac12\theta + \sin^2\tfrac12\theta = 1$ that it will be parallel to any
other direction (in the positive or the negative sense). This means, as
stated above, that the direction of the reference-axis to which the
electron’s moment must be assumed to be parallel can be chosen quite
arbitrarily.
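The half-angle rule can be illustrated with explicit two-component states. The sketch assumes that the new axis $z'$ lies in the $xz$-plane at an angle $\theta$ to $z$, that $\sigma_z$ has diagonal elements $-1$, $+1$ (so that the state ‘moment along $+z$’ is the pair $(0, 1)$), and that all amplitudes are real; the function names are hypothetical.

```python
import math

def sigma_along(theta):
    """Matrix sin(theta) * sigma_x + cos(theta) * sigma_z for an axis z' in the xz-plane."""
    c, s = math.cos(theta), math.sin(theta)
    return ((-c, s), (s, c))

def state_along(theta, sign=+1):
    """Normalized real state with moment along +z' (sign > 0) or -z' (sign < 0)."""
    half = theta / 2.0
    if sign > 0:
        return (math.sin(half), math.cos(half))
    return (math.cos(half), -math.sin(half))

def apply(matrix, vector):
    """Action of a 2x2 matrix on a two-component state."""
    return tuple(matrix[i][0] * vector[0] + matrix[i][1] * vector[1] for i in range(2))

def probability(state, reference=(0.0, 1.0)):
    """Squared overlap with the reference state; (0, 1) means 'moment along +z'.

    No complex conjugation is needed because every amplitude here is real.
    """
    overlap = state[0] * reference[0] + state[1] * reference[1]
    return overlap ** 2

THETA = 1.1                                   # arbitrary angle in radians
P_PLUS = probability(state_along(THETA, +1))  # found along +z'
P_MINUS = probability(state_along(THETA, -1)) # found along -z'
```

For any `THETA`, `P_PLUS` should equal $\cos^2\tfrac12\theta$, `P_MINUS` should equal $\sin^2\tfrac12\theta$, and their sum should be unity.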

All these results can be considered as a particular case of those
holding for the magnetic moment, or the mechanical angular momentum,
due to the orbital motion of a (non-spinning) electron in a radially
symmetrical (central) field of force. As shown in Chapter II, the
$z$-component of this orbital angular momentum $M_z$ can be assumed to
be quantized, i.e. to take a discrete set of (characteristic) values $mh/2\pi$,
the axial quantum number $m$ varying from $-l$ to $+l$, where $l$ is the
angular quantum number determining the total angular momentum
according to the formula $M^2 = h^2 l(l+1)/4\pi^2$, while the $x$- and $y$-components
of $\mathbf{M}$ do not have definite values. The present case can be
obtained from the general case by taking $l$ equal to $\tfrac12$, i.e. by ascribing
to the electron, irrespective of its orbital motion, a spin motion of a
‘half-quantum’ magnitude. We have seen in Chapter III that the
matrix representation of physical quantities, being more general than
the operator representation, leaves room both for integral and half-integral
values of the angular quantum number, subject to the condition
that the axial quantum number should vary by elementary steps
$\Delta m = 1$ from $-l$ to $+l$. This vacant place, or rather the lowest vacant
step on the $l$-staircase, can now be filled by the electron’s spin angular
momentum. The other (higher) steps can be represented by combining
the latter with the orbital angular momentum, if any (see below).

The possibility of attributing to the electron, in addition to an
intrinsic magnetic moment $\boldsymbol{\mu}$, an intrinsic angular momentum $\mathbf{s}$ proportional
to it, i.e. represented by the same matrix $\boldsymbol{\sigma}$ with a certain
numerical factor, follows also from the fact that this matrix satisfies
the commutation relation (253 b) which is quite similar to the commutation
relation $\mathbf{M}\times\mathbf{M} = -h\mathbf{M}/2\pi i$ satisfied by the orbital angular
momentum $\mathbf{M}$. Assuming the electron to possess an intrinsic angular
momentum

$$\mathbf{s} = \kappa\boldsymbol{\sigma} \tag{255}$$

satisfying the preceding relation, we get

$$\kappa^2\,\boldsymbol{\sigma}\times\boldsymbol{\sigma} = -\frac{h}{2\pi i}\,\kappa\boldsymbol{\sigma},$$

or, according to (253 b),

$$\kappa = \frac12\,\frac{h}{2\pi}, \tag{255 a}$$

which means that the magnitude of this momentum corresponds to
$l = \tfrac12$, as was deduced above from the fact that the electron’s magnetic
moment can only assume two (opposite) orientations parallel to a
quantization axis.

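It should be noticed that the formula $\mathbf{M} = \kappa\boldsymbol{\sigma}$, with the above half-quantum
value of $\kappa$, does not contradict the result that the characteristic
value of the square of $\mathbf{M}$ must be equal not to $\tfrac14 h^2/4\pi^2$, but to
$\tfrac34 h^2/4\pi^2$, where $\tfrac34 = l(l+1)$ with $l = \tfrac12$. In fact, squaring the equation
$\mathbf{s} = \kappa\boldsymbol{\sigma}$, we get

$$s^2 = \kappa^2\sigma^2 = \kappa^2(\sigma_x^2 + \sigma_y^2 + \sigma_z^2).$$

The characteristic values of $s^2$ are obtained by substituting the characteristic
values of $\sigma_x^2$, $\sigma_y^2$, $\sigma_z^2$. Now from the definition of the matrices
$\sigma_x$, $\sigma_y$, $\sigma_z$, it follows that their squares are equal to the unit matrix
$\delta = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$:

$$\sigma_x^2 = \sigma_y^2 = \sigma_z^2 = \delta. \tag{255 b}$$

The characteristic values of the latter being equal to 1, we thus get

$$\text{char. value of } M^2 = 3\kappa^2 = \frac34\,\frac{h^2}{4\pi^2}.$$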
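The arithmetic behind the value $\tfrac34$ can be reproduced exactly. The sketch below assumes the same explicit matrices as before and measures $\kappa$ in units of $h/2\pi$; the constant names are illustrative.

```python
from fractions import Fraction

SIGMAS = (
    ((0, 1), (1, 0)),        # sigma_x
    ((0, 1j), (-1j, 0)),     # sigma_y
    ((-1, 0), (0, 1)),       # sigma_z (sign convention assumed for this section)
)

def square(m):
    """Matrix square of a 2x2 matrix given as nested tuples."""
    return tuple(
        tuple(sum(m[i][k] * m[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def sum_of_squares(matrices):
    """Element-wise sum of the squares of the given matrices."""
    squares = [square(m) for m in matrices]
    return tuple(
        tuple(sum(sq[i][j] for sq in squares) for j in range(2))
        for i in range(2)
    )

SIGMA_SQUARED = sum_of_squares(SIGMAS)   # expected: three times the unit matrix

KAPPA = Fraction(1, 2)                   # kappa in units of h / 2 pi, from (255 a)
S_SQUARED = 3 * KAPPA ** 2               # characteristic value of s^2 in units of (h / 2 pi)^2
L_SPIN = Fraction(1, 2)                  # angular quantum number assigned to the spin
```

The value `S_SQUARED` coincides with `L_SPIN * (L_SPIN + 1)`, i.e. with $l(l+1)$ for $l = \tfrac12$, and not with $\kappa^2$ alone.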

While the electron’s intrinsic angular momentum $\kappa$ has a half-quantum
value, its magnetic moment $\mu = he/4\pi m_0 c$ has a whole-quantum value,
i.e. the same value as the magnetic moment due to the orbital motion
with the angular quantum number $l = 1$. The ratio of the magnetic
moment to the angular momentum

$$\frac{\mu}{\kappa} = \frac{e}{m_0 c}$$

is thus twice as large in the case of spin as it is in the case of the orbital
motion.

This difference may be reduced formally to the fact that the spin
matrix satisfies the relations (253) and (253 a), which are responsible for
the factor 2 in (253 b) and consequently for the factor $\tfrac12$ in (255 a) (these
relations have no parallel in the case of the matrices representing the
orbital angular momentum). It is the fundamental cause of the complications
in the action of a magnetic field on a spinning electron,
moving in a central field of force, which are usually referred to as the
‘anomalous’ Zeeman effect.

Postponing the detailed consideration of the latter till a later section,
we shall calculate here the rate of change of the total angular momentum
of the electron due to the couple produced by the magnetic field. If the
preceding assumptions about the electron’s spin are correct, then we
must have (so long as the electrostatic field can be supposed to produce
no couple), according to the classical mechanics,

$$\frac{d}{dt}(\mathbf{L} + \kappa\boldsymbol{\sigma}) = \frac{e}{2m_0c}(\mathbf{L} + 2\kappa\boldsymbol{\sigma})\times\mathfrak{H}, \tag{256}$$

where $\mathbf{L}$ is that part of the angular momentum which is due to the
orbital motion. The same equation must hold in wave mechanics if
$\mathbf{L}$ and $\boldsymbol{\sigma}$ are considered as operators and if the time derivative of an
operator $F$ is defined with the help of the energy operator $K$ by means
of the formula

$$\frac{dF}{dt} = [K, F] = \frac{2\pi i}{h}(KF - FK). \tag{256 a}$$

In equation (256), the operator (or operator-matrix) $\mathbf{M} = \mathbf{L} + \kappa\boldsymbol{\sigma}$ represents
the total angular momentum of the electron and the operator-matrix

$$\frac{e}{2m_0c}\mathbf{L} + \frac{e}{m_0c}\mathbf{s} = \frac{e}{2m_0c}(\mathbf{L} + 2\mathbf{s})$$

the total magnetic moment, due both to its motion about the nucleus
and the supposed ‘spinning’ about its own axis.

Neglecting the terms proportional to the square of the magnetic
field, we can put

$$K = H - \mu\,\boldsymbol{\sigma}\cdot\mathfrak{H},$$

where $H$ is the Schrödinger energy operator,

$$H = \frac{1}{2m_0}\left(\frac{h}{2\pi i}\nabla\right)^2 + U - \frac{e}{2m_0c}\,\mathfrak{H}\cdot\mathbf{L}$$

[cf. (210 a, b), § 26], supposed to be multiplied by the two-dimensional
unit matrix $\delta$ [$e\mathbf{L}/(2m_0c)$ is the magnetic moment of the orbital motion].

The sum of the first two terms of this operator, representing the
kinetic energy and the potential energy of the radially symmetrical
electric field, commutes both with $\mathbf{L}$ and $\boldsymbol{\sigma}$, so that in the formula
(256 a), with $F = \mathbf{M}$, we can put simply

$$K = -\mathfrak{H}\cdot\left(\frac{e}{2m_0c}\mathbf{L} + \mu\boldsymbol{\sigma}\right), \tag{256 b}$$

it being understood that $\mathbf{L}$ is multiplied by the unit matrix $\delta$.

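Now we have, since $\boldsymbol{\sigma}$ obviously commutes with $\mathbf{L}$,

$$[K, \mathbf{L}] = -\frac{e}{2m_0c}\,[(\mathfrak{H}\cdot\mathbf{L}), \mathbf{L}], \qquad [K, \kappa\boldsymbol{\sigma}] = -\mu\kappa\,[(\mathfrak{H}\cdot\boldsymbol{\sigma}), \boldsymbol{\sigma}].$$

For the sake of simplicity, we shall assume the magnetic field to be
parallel to the $z$-axis (this does not, of course, involve any loss of
generality). Taking the rectangular components of the bracket expressions
on the right side of the preceding equations, we get, with the
help of the equations $\mathbf{L}\times\mathbf{L} = -h\mathbf{L}/2\pi i$ and $\boldsymbol{\sigma}\times\boldsymbol{\sigma} = 2i\boldsymbol{\sigma}$,

$$\begin{aligned}
[\mathfrak{H}L_z, L_x] &= \mathfrak{H}[L_z, L_x] = \frac{2\pi i}{h}\mathfrak{H}(\mathbf{L}\times\mathbf{L})_y = -\mathfrak{H}L_y = -(\mathbf{L}\times\mathfrak{H})_x,\\
[\mathfrak{H}L_z, L_y] &= \mathfrak{H}[L_z, L_y] = -\frac{2\pi i}{h}\mathfrak{H}(\mathbf{L}\times\mathbf{L})_x = \mathfrak{H}L_x = -(\mathbf{L}\times\mathfrak{H})_y,\\
[\mathfrak{H}L_z, L_z] &= 0,\\
[\mathfrak{H}\sigma_z, \sigma_x] &= \frac{2\pi i}{h}\mathfrak{H}(\boldsymbol{\sigma}\times\boldsymbol{\sigma})_y = -\frac{4\pi}{h}\mathfrak{H}\sigma_y = -\frac{4\pi}{h}(\boldsymbol{\sigma}\times\mathfrak{H})_x,\\
[\mathfrak{H}\sigma_z, \sigma_y] &= -\frac{2\pi i}{h}\mathfrak{H}(\boldsymbol{\sigma}\times\boldsymbol{\sigma})_x = \frac{4\pi}{h}\mathfrak{H}\sigma_x = -\frac{4\pi}{h}(\boldsymbol{\sigma}\times\mathfrak{H})_y.
\end{aligned}$$

We thus have, returning to the vector notation,

$$[(\mathfrak{H}\cdot\mathbf{L}), \mathbf{L}] = -\mathbf{L}\times\mathfrak{H}, \qquad [(\mathfrak{H}\cdot\boldsymbol{\sigma}), \boldsymbol{\sigma}] = -\frac{4\pi}{h}\,\boldsymbol{\sigma}\times\mathfrak{H},$$

and consequently

$$[K, (\mathbf{L} + \kappa\boldsymbol{\sigma})] = \left(\frac{e}{2m_0c}\mathbf{L} + \frac{4\pi}{h}\mu\kappa\boldsymbol{\sigma}\right)\times\mathfrak{H},$$

or, since $\kappa = h/4\pi$ and $\mu = e\kappa/m_0c$,

$$[K, (\mathbf{L} + \kappa\boldsymbol{\sigma})] = \left(\frac{e}{2m_0c}\mathbf{L} + \mu\boldsymbol{\sigma}\right)\times\mathfrak{H} = \frac{e}{2m_0c}(\mathbf{L} + 2\kappa\boldsymbol{\sigma})\times\mathfrak{H},$$

which is nothing else but equation (256).

Our interpretation of the matrix $\boldsymbol{\sigma}$ as representing a spin motion of
the electron with an angular momentum $\kappa = h/4\pi$ and a magnetic
moment $\mu = eh/4\pi m_0c$ is thus fully checked, at least from the formal
point of view. One may argue that it cannot have an actual physical
significance since the electron in the Pauli theory, just as in that of
Schrödinger, is dealt with as a point, with definite coordinates $x$, $y$, $z$,
and a point-like particle cannot be imagined to be spinning. To this
one can retort firstly, that Pauli’s theory amounts to the addition of
a fourth ‘spin’ coordinate, giving a schematical representation of the
spin motion; and secondly, that the translational motion (in particular
the revolution about a fixed centre) in wave mechanics is also represented
in a schematical way only.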
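The spin part of the couple calculation above can be checked with explicit matrices. The following sketch assumes units in which $h/2\pi$ is a plain number `hbar`, takes only the spin term $-\mu\,\boldsymbol{\sigma}\cdot\mathfrak{H}$ of the energy operator, and uses the matrix convention with $\sigma_z$ having diagonal elements $-1$, $+1$; function names and sample values are hypothetical.

```python
SIGMA = (
    ((0, 1), (1, 0)),        # sigma_x
    ((0, 1j), (-1j, 0)),     # sigma_y
    ((-1, 0), (0, 1)),       # sigma_z (sign convention assumed for this section)
)

def matmul(a, b):
    """Product of two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def combine(coefficients, matrices):
    """Linear combination sum_k coefficients[k] * matrices[k]."""
    return tuple(
        tuple(sum(c * m[i][j] for c, m in zip(coefficients, matrices)) for j in range(2))
        for i in range(2)
    )

def bracket(k, f, hbar=1.0):
    """Bracket of (256 a): (i / hbar) (K F - F K), with hbar standing for h / 2 pi."""
    kf, fk = matmul(k, f), matmul(f, k)
    return tuple(tuple(1j / hbar * (kf[i][j] - fk[i][j]) for j in range(2)) for i in range(2))

def spin_torque_sides(field, mu=1.0, hbar=1.0):
    """Return ([K, kappa sigma_j]) and (mu (sigma x field)_j) for j = x, y, z."""
    kappa = hbar / 2.0                                   # (255 a)
    energy = combine([-mu * h for h in field], SIGMA)    # spin term -mu sigma . field
    left, right = [], []
    for j in range(3):
        k, l = (j + 1) % 3, (j + 2) % 3
        left.append(bracket(energy, combine([kappa], [SIGMA[j]]), hbar))
        # (sigma x field)_j = sigma_k field_l - sigma_l field_k
        right.append(combine([mu * field[l], -mu * field[k]], [SIGMA[k], SIGMA[l]]))
    return left, right

def max_difference(left, right):
    """Largest absolute element-wise difference between two lists of matrices."""
    return max(
        abs(a[i][j] - b[i][j])
        for a, b in zip(left, right)
        for i in range(2)
        for j in range(2)
    )

LEFT, RIGHT = spin_torque_sides(field=(0.3, -1.2, 2.0), mu=1.5, hbar=1.0)
```

The two sides agree for an arbitrary field direction, which is the spin half of equation (256): the rate of change of $\kappa\boldsymbol{\sigma}$ equals $\mu\,\boldsymbol{\sigma}\times\mathfrak{H}$.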


30. More Exact Form of the Two-dimensional Matrix Theory; 
Electron’s Electric Moment 

Pauli’s theory, discussed in the preceding section, accounts for the 
duplicity phenomenon in the presence of a magnetic field only, whereas, 
in reality, this phenomenon is observed just as well without such a field. 
A full account of the experimental facts is given by the theory of Dirac 
which we are now going to examine on the same lines. The preceding 
analysis of Pauli’s theory will prove very helpful in the discussion of 
the mathematical form and physical meaning of Dirac’s exact theory. 


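If we put

$$\psi_3 = \chi_1, \qquad \psi_4 = \chi_2, \tag{257}$$

then equations (229 a) and (229 b) of Dirac’s theory can be written in
the following form:

$$\begin{aligned}
\boldsymbol{\sigma}\cdot\mathbf{u}\,\psi + (u_t - m_0c)\chi &= 0,\\
\boldsymbol{\sigma}\cdot\mathbf{u}\,\chi + (u_t + m_0c)\psi &= 0,
\end{aligned} \tag{257 a}$$

where $\boldsymbol{\sigma}$ is Pauli’s spin matrix, while the operators $u_t \pm m_0c$ are understood
to be multiplied by the unit matrix $\delta = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$; $\psi$ denotes here
the two-dimensional matrix $\begin{pmatrix} \psi_1 & 0 \\ \psi_2 & 0 \end{pmatrix}$ and $\chi$ denotes the matrix $\begin{pmatrix} \chi_1 & 0 \\ \chi_2 & 0 \end{pmatrix}$.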

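Applying to the first of equations (257 a) the operation $\boldsymbol{\sigma}\cdot\mathbf{u}$, we get,
with the help of the second equation,

$$(\boldsymbol{\sigma}\cdot\mathbf{u})^2\psi + [(\boldsymbol{\sigma}\cdot\mathbf{u})u_t - u_t(\boldsymbol{\sigma}\cdot\mathbf{u})]\chi + (u_t - m_0c)\,\boldsymbol{\sigma}\cdot\mathbf{u}\,\chi$$
$$= (\boldsymbol{\sigma}\cdot\mathbf{u})^2\psi + [(\boldsymbol{\sigma}\cdot\mathbf{u})u_t - u_t(\boldsymbol{\sigma}\cdot\mathbf{u})]\chi - (u_t - m_0c)(u_t + m_0c)\psi = 0.$$

Now $(u_t - m_0c)(u_t + m_0c) = u_t^2 - m_0^2c^2$; we have further, according to
the multiplication rules of the matrices $\sigma_x$, $\sigma_y$, $\sigma_z$,

$$(\boldsymbol{\sigma}\cdot\mathbf{u})^2 = \sigma_x^2u_x^2 + \dots + \sigma_x\sigma_yu_xu_y + \sigma_y\sigma_xu_yu_x + \dots = (u_x^2 + u_y^2 + u_z^2) + i\sigma_z(u_xu_y - u_yu_x) + \dots,$$

i.e., according to (218),

$$(\boldsymbol{\sigma}\cdot\mathbf{u})^2 = u^2 - \frac{he}{2\pi c}\,\boldsymbol{\sigma}\cdot\mathbf{H}.$$

Putting, for the sake of brevity, $u^2 - u_t^2 + m_0^2c^2 = \mathfrak{D}$, we thus get

$$\mathfrak{D}\psi - \frac{he}{2\pi c}\,\boldsymbol{\sigma}\cdot(\mathbf{H}\psi - i\mathbf{E}\chi) = 0. \tag{258}$$

In a similar way we obtain the equation

$$\mathfrak{D}\chi - \frac{he}{2\pi c}\,\boldsymbol{\sigma}\cdot(\mathbf{H}\chi - i\mathbf{E}\psi) = 0. \tag{258 a}$$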

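These equations are equivalent respectively to the second-order equations
(230) and (230 a) of the Dirac theory and could, of course, be
derived directly from the latter.

The expressions (232) and (232 a) for the probability density and the
probability current-density can be written in the form

$$\rho = \psi^{\dagger}\psi + \chi^{\dagger}\chi, \tag{259}$$

$$\mathbf{j} = c(\psi^{\dagger}\boldsymbol{\sigma}\chi + \chi^{\dagger}\boldsymbol{\sigma}\psi). \tag{259 a}$$

In the case of a conservative motion with a positive energy $\epsilon$ which
differs relatively little from the rest energy $m_0c^2$, the functions $\chi$ can
be expressed in terms of $\psi$ with the help of the relations (233 a) or

$$\chi = \frac{1}{2m_0c}\,(\boldsymbol{\sigma}\cdot\mathbf{u})\psi, \tag{260}$$

which is the approximate form of the first of equations (257 a).

Using the relation

$$\sigma_x(\boldsymbol{\sigma}\cdot\mathbf{u}) = \sigma_x\sigma_xu_x + \sigma_x\sigma_yu_y + \sigma_x\sigma_zu_z = u_x + i\sigma_zu_y - i\sigma_yu_z,$$

that is,

$$\boldsymbol{\sigma}(\boldsymbol{\sigma}\cdot\mathbf{u}) = \mathbf{u} + i\,\mathbf{u}\times\boldsymbol{\sigma}, \tag{260 a}$$

we get, substituting the expression (260) in (259 a),

$$\mathbf{j} = \frac{1}{2m_0}\psi^{\dagger}\mathbf{u}\psi + \frac{i}{2m_0}\psi^{\dagger}(\mathbf{u}\times\boldsymbol{\sigma})\psi + \text{conjugate complex},$$

which is easily reduced to the approximate form (234 b) with
$\mathfrak{M} = \mu\psi^{\dagger}\boldsymbol{\sigma}\psi$, in agreement with (252 b). As a matter of fact, we have
merely repeated the argument of § 28, using the new matrix notation
to illustrate its convenience.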
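The matrix identity used in this argument can be checked for ordinary numbers. The sketch below replaces the operator $\mathbf{u}$ by a triple of commuting numbers, which is an assumption made only for the purpose of the check (for the true operators the components of $\mathbf{u}$ do not commute in a magnetic field); the matrices follow the convention with $\sigma_z$ having diagonal elements $-1$, $+1$, and the names are illustrative.

```python
SIGMA = (
    ((0, 1), (1, 0)),        # sigma_x
    ((0, 1j), (-1j, 0)),     # sigma_y
    ((-1, 0), (0, 1)),       # sigma_z (sign convention assumed for this section)
)
UNIT = ((1, 0), (0, 1))

def matmul(a, b):
    """Product of two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def combine(coefficients, matrices):
    """Linear combination sum_k coefficients[k] * matrices[k]."""
    return tuple(
        tuple(sum(c * m[i][j] for c, m in zip(coefficients, matrices)) for j in range(2))
        for i in range(2)
    )

def identity_sides(u):
    """Both sides of sigma (sigma . u) = u + i u x sigma, component by component."""
    sigma_dot_u = combine(u, SIGMA)
    left, right = [], []
    for j in range(3):
        k, l = (j + 1) % 3, (j + 2) % 3
        left.append(matmul(SIGMA[j], sigma_dot_u))
        # u_j * delta + i (u x sigma)_j, with (u x sigma)_j = u_k sigma_l - u_l sigma_k
        right.append(combine([u[j], 1j * u[k], -1j * u[l]], [UNIT, SIGMA[l], SIGMA[k]]))
    return left, right

U = (2, -1, 3)                                   # arbitrary commuting components
LEFT, RIGHT = identity_sides(U)
SIGMA_DOT_U_SQUARED = matmul(combine(U, SIGMA), combine(U, SIGMA))
```

For commuting components the square of $\boldsymbol{\sigma}\cdot\mathbf{u}$ reduces to $u^2$ times the unit matrix; the extra field-dependent term in the text arises only from the non-commutability of the operator components.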

The equation of Pauli’s theory was obtained from (258) by neglecting
the last term (proportional to $\chi$) and replacing the two terms
$-u_t^2 + m_0^2c^2$ in the relativistic operator $\mathfrak{D}$ by $2m_0(-K + U)$. We shall get a
better approximation if we substitute in (258) the expression (260) for
$\chi$, which gives an additional term of the second order in $1/c$, and
introduce a correction term of the same order in the expression for $\mathfrak{D}$.
Limiting ourselves, for the sake of simplicity, to the case of conservative
motion, and putting $\epsilon = m_0c^2 + K$ and $\epsilon' = m_0c^2 + K'$, we have

$$u_t = -\frac{1}{c}(\epsilon' - U) = -\frac{1}{c}(m_0c^2 + K' - U).$$

This gives $\mathfrak{D} = u^2 - 2m_0(K' - U) - (K' - U)^2/c^2$, so that equation (258)
assumes the form

$$\left[u^2 - 2m_0(K' - U) - \frac{1}{c^2}(K' - U)^2\right]\psi - \frac{he}{2\pi c}\,\boldsymbol{\sigma}\cdot(\mathbf{H}\psi - i\mathbf{E}\chi) = 0.$$

Neglecting the relativistic corrections, i.e. putting $c = \infty$, we obtain
the ordinary Schrödinger equation

$$[u^2 - 2m_0(K' - U)]\psi = 0,$$

whence it follows that, with an accuracy of the order of $1/c^2$, we can
replace the operator $(K' - U)^2/c^2$ by $u^4/(2m_0c)^2 = (u_x^2 + u_y^2 + u_z^2)^2/(2m_0c)^2$.
The preceding equation thus reduces to the standard form

$$(K - K')\psi = 0,$$

with the energy operator

$$K = U + \frac{u^2}{2m_0} - \frac{u^4}{(2m_0)^3c^2} - \mu\,\boldsymbol{\sigma}\cdot\left[\mathbf{H} - \frac{i}{2m_0c}\,\mathbf{E}\,(\boldsymbol{\sigma}\cdot\mathbf{u})\right].$$


With the help of the formula (260 a) the last term in this expression
can be rewritten in the form

$$-\mu\left[\mathbf{H}\cdot\boldsymbol{\sigma} + \mathbf{E}\cdot\frac{1}{2m_0c}(\mathbf{u}\times\boldsymbol{\sigma} - i\mathbf{u})\right].$$

The operator $i\mu\,\mathbf{E}\cdot\mathbf{u}$ represents a purely imaginary quantity whose
average value vanishes and which can therefore be left out of account.†
Putting $\mu\boldsymbol{\sigma} = \boldsymbol{\mu}$, we thus get

$$K = \left(\frac{1}{2m_0}u^2 + U\right) + S, \tag{261}$$

where the first term represents the usual (Schrödinger) energy operator
(multiplied by the two-dimensional unit-matrix $\delta$), while the operator

$$S = -\frac{u^4}{(2m_0)^3c^2} - \boldsymbol{\mu}\cdot\mathbf{H} - \mathbf{E}\cdot\frac{1}{2m_0c}\,\mathbf{u}\times\boldsymbol{\mu} \tag{261 a}$$

can be regarded as a kind of perturbation energy, which specifies, with
an accuracy of the second order in $1/c$, the influence of the relativity
corrections. One of these, represented by the first term in $S$, refers to
the variability of mass with velocity, while the other, represented by the
second and third terms, corresponds to the spin phenomenon. The second
term, which has been discussed already in the preceding section, can
be regarded as the additional energy due to the electron’s intrinsic
magnetic moment $\boldsymbol{\mu}$. As to the third term, it can be interpreted in
a similar way, namely, as the additional energy due to the presence
of an electric moment represented by the operator

$$\boldsymbol{\nu} = \frac{1}{2m_0c}\,\mathbf{u}\times\boldsymbol{\mu}.$$

† In fact the product $\frac{e}{m_0}\,\mathbf{E}\cdot\mathbf{u}$ is approximately equal to the work done on the electron
per unit time, i.e. to $-dU/dt$; in the case of a stationary motion its average value
must obviously be equal to zero.

We are thus led to regard the electron as a particle combining the
properties of a point charge, of an elementary magnet, and of an
elementary electric dipole, with an electric moment proportional to the
magnetic moment ($\boldsymbol{\mu}$) and to the velocity of translational motion,
represented approximately by the operator $\mathbf{u}/m_0$.
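The first term of the perturbation energy, attributed above to the variability of mass with velocity, can be compared with the exact relativistic kinetic energy of a free particle. This is only a numerical sketch for ordinary numbers (not operators), assuming units with $m_0 = c = 1$ by default; the function names are hypothetical.

```python
import math

def exact_kinetic(p, m0=1.0, c=1.0):
    """Relativistic energy minus rest energy for a free particle of momentum p."""
    return math.sqrt((m0 * c ** 2) ** 2 + (c * p) ** 2) - m0 * c ** 2

def schroedinger_kinetic(p, m0=1.0):
    """Non-relativistic kinetic energy p^2 / (2 m0)."""
    return p ** 2 / (2 * m0)

def corrected_kinetic(p, m0=1.0, c=1.0):
    """p^2 / (2 m0) plus the correction -p^4 / ((2 m0)^3 c^2), as in the first term of S."""
    return schroedinger_kinetic(p, m0) - p ** 4 / ((2 * m0) ** 3 * c ** 2)

P = 0.1                                            # momentum small compared with m0 * c
ERROR_PLAIN = abs(exact_kinetic(P) - schroedinger_kinetic(P))
ERROR_CORRECTED = abs(exact_kinetic(P) - corrected_kinetic(P))
```

For a momentum small compared with $m_0c$, the corrected expression is markedly closer to the exact value than $p^2/2m_0$ alone, the remaining discrepancy being of the next order in $1/c^2$.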

It should be mentioned that the association of an electric moment
with a moving particle which is known to possess, when at rest, a
magnetic moment, is a direct consequence of the relativity theory as
applied to the connexion between the magnetic and the electric field.

If we have, for example, in the coordinate system $A$ only a magnetic
field $\mathbf{H}$ ($\mathbf{E} = 0$), then in another system $A'$ which is moving relatively
to the first with a velocity $\mathbf{v}' = -\mathbf{v}$, we must have, in addition to
a magnetic field $\mathbf{H}'$ which is slightly different from $\mathbf{H}$ (the difference
being of the second order in $v/c$), an electric field

$$\mathbf{E}' = -\mathbf{v}\times\mathbf{H}'/c \cong -\mathbf{v}\times\mathbf{H}/c,$$

and vice versa: in the case of the presence of a pure electric field
$\mathbf{E}$ ($\mathbf{H} = 0$) in the system $A$, there must be, in the system $A'$, besides
an electric field $\mathbf{E}'$ somewhat different from $\mathbf{E}$, also a magnetic field

$$\mathbf{H}' = \mathbf{v}\times\mathbf{E}'/c \cong \mathbf{v}\times\mathbf{E}/c.$$

Let us consider in the latter case a particle which is moving with the
system $A'$ and which, with regard to this system, possesses a magnetic
moment $\boldsymbol{\mu}$. It will have accordingly an additional magnetic energy
$U' = -\boldsymbol{\mu}\cdot\mathbf{H}' = -\boldsymbol{\mu}\cdot\mathbf{v}\times\mathbf{E}'/c \cong \boldsymbol{\mu}\cdot\mathbf{v}'\times\mathbf{E}/c$. Now this energy can be
expressed in the form

$$U' = -\frac{1}{c}\,\mathbf{E}'\cdot(\boldsymbol{\mu}\times\mathbf{v}) \qquad\text{or}\qquad U' \cong -\frac{1}{c}\,\mathbf{E}\cdot(\mathbf{v}'\times\boldsymbol{\mu})$$

and interpreted as the additional electric energy with regard to the
system $A$ of an electric dipole with a moment

$$\boldsymbol{\nu} = \frac{1}{c}\,\mathbf{v}'\times\boldsymbol{\mu}.$$

We are thus entitled to assume that a particle which, when at rest,
behaves like an elementary magnet with a moment $\boldsymbol{\mu}$ acquires, when
moving with a velocity $\mathbf{v}'$, an electric moment $\mathbf{v}'\times\boldsymbol{\mu}/c$. This result
can be obtained directly with the help of the spinning sphere model of
the electron, if due account is taken of the redistribution of the electric
current density produced by the superposition of the translatory
motion on that of rotation.†
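The equivalence of the two ways of writing the energy rests on a triple-product identity, which can be tested with exact arithmetic. The sketch works to first order in $v/c$ (so that $\mathbf{E}'$ is identified with $\mathbf{E}$), treats all vectors as classical number triples, and uses hypothetical function names and sample values.

```python
from fractions import Fraction

def cross(a, b):
    """Vector product of two 3-component tuples."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )

def dot(a, b):
    """Scalar product of two 3-component tuples."""
    return sum(x * y for x, y in zip(a, b))

def scaled(c, a):
    """Multiply every component of a vector by the number c."""
    return tuple(c * x for x in a)

def magnetic_energy(mu, v_prime, e_field, c):
    """U' = -mu . H', with H' = v x E / c and v = -v' (first order in v/c)."""
    v = scaled(-1, v_prime)
    h_prime = scaled(Fraction(1) / c, cross(v, e_field))
    return -dot(mu, h_prime)

def electric_moment(mu, v_prime, c):
    """nu = v' x mu / c, the moment of the equivalent electric dipole."""
    return scaled(Fraction(1) / c, cross(v_prime, mu))

def dipole_energy(nu, e_field):
    """Energy -nu . E of an electric dipole nu in the field E."""
    return -dot(nu, e_field)

# Arbitrary integer sample vectors and a rational 'velocity of light'.
MU, V_PRIME, E_FIELD, C = (1, -2, 3), (4, 0, -1), (2, 5, -3), Fraction(300)
U_MAGNETIC = magnetic_energy(MU, V_PRIME, E_FIELD, C)
U_ELECTRIC = dipole_energy(electric_moment(MU, V_PRIME, C), E_FIELD)
```

Because only integers and rational numbers enter, the two energies can be compared for exact equality.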

Replacing the velocity $\mathbf{v}'$ by the operator $\mathbf{u}/m_0$, we obtain for the
representation of the electron’s electric moment the operator

$$\boldsymbol{\nu} = \frac{1}{m_0c}\,\mathbf{u}\times\boldsymbol{\mu}, \tag{261 b}$$

which is just double the previous expression. The additional electric
energy, represented by the last term in (261 a), must be written accordingly
in the form

$$U_e = -\tfrac12\,\mathbf{E}\cdot\boldsymbol{\nu}, \tag{261 c}$$

while the magnetic energy is expressed in the usual way by

$$U_m = -\mathbf{H}\cdot\boldsymbol{\mu}.$$

The origin of the factor $\tfrac12$ in (261 c) can be interpreted in different
ways. It can be obtained, in the first place, by applying the relativity
theory to the spin motion.‡ It is simpler, however, to connect it with
the fact that the energy $U_e$ corresponds to a second-order effect (while
$U_m$ corresponds to a first-order effect), as in the familiar case of a
particle possessing no rigid electric dipole moment, and acquiring such
a moment under the influence of the electric field only. In the present
case, this influence is an indirect one, proceeding through the velocity
of translational motion which is maintained by the electric field.

Before discussing the exact theory of Dirac, we shall apply the preceding
corrected form of the Pauli theory to the approximate calculation
of the so-called ‘relativity corrections’, i.e. of the shift and splitting of
the energy-levels of an electron moving in a spherically symmetrical
electric field with or without a homogeneous magnetic field superposed
upon it.

† See my Lehrbuch der Elektrodynamik, vol. i, pp. 295-6.

‡ See L. H. Thomas, Nature (1926), p. 514, and Phil. Mag. (1927); also J. Frenkel,
Zeits. f. Phys. 37 (1926), 273.

A. No magnetic field

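The perturbation energy reduces in this case to

$$S = -\frac{p^4}{(2m_0)^3c^2} - \frac{\mu}{2m_0c}\,\mathbf{E}\cdot(\mathbf{p}\times\boldsymbol{\sigma}), \tag{262}$$

where $\mathbf{p} = h\nabla/2\pi i$ is the operator representing the electron’s momentum.
Putting

$$\mathbf{E} = \frac{Ze}{r^3}\,\mathbf{r},$$

which corresponds to a Coulomb field of force produced by a nucleus
with a charge $Ze$, we get

$$\mathbf{E}\cdot(\mathbf{p}\times\boldsymbol{\sigma}) = \boldsymbol{\sigma}\cdot(\mathbf{E}\times\mathbf{p}) = \frac{Ze}{r^3}\,\boldsymbol{\sigma}\cdot(\mathbf{r}\times\mathbf{p}) = \frac{Ze}{r^3}\,\boldsymbol{\sigma}\cdot\mathbf{L},$$

where $\mathbf{L} = \mathbf{r}\times\mathbf{p}$ is the operator of the electron’s angular momentum
(without the contribution $\kappa\boldsymbol{\sigma}$ due to the spin). Substituting this expression
in (262) and replacing $p^2/2m_0$ by $H' - U = H' + Ze^2/r$, where $H'$ is
the unperturbed energy, as given by Schrödinger’s or Bohr’s theory,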

we £ et 1 XI Ze 2 \ 2 oc 1 

' s =-2 Wo c4r + (2C2a) 


the charge of the electron being denoted by —e. 

The expression (262 a) is somewhat similar to the expression (150) for
the magnetic perturbation energy, differing from it in the first place
by the fact that the constant magnetic field $\mathfrak{H}$ is replaced by a kind
of effective magnetic field

*•» - »£> < 202b > 
which is inversely proportional to the cube of the distance from the
nucleus and parallel to the vector of the angular momentum L, and in
the second place by the appearance of the additional term

$$ -\frac{1}{2m_0c^2}\Bigl(W + \frac{Ze^2}{r}\Bigr)^2, $$

which is supposed to be multiplied by the unit matrix $\delta$.


The argument used for the solution of the magnetic perturbation 
problem in the previous section can thus be applied, practically without 
any modification, to the present case; it can be simplified by using from 
the outset a coordinate system with the z-axis parallel to the vector
L (which is a constant of the unperturbed motion). 
The result is expressed by the formula

$$ \Delta H' = -\frac{1}{2m_0c^2}\,\overline{\Bigl(H' + \frac{Ze^2}{r}\Bigr)^2} \pm \alpha L\,\overline{\Bigl(\frac{1}{r^3}\Bigr)}, \qquad (263) $$


where the averaging is to be carried out for the unperturbed motion
with the help of the usual (scalar) Schrödinger function $\psi$ specifying it,
according to the formula $\overline{F} = \int F\psi\psi^*\,dV$. The preceding formula can
be interpreted by assuming two types of the perturbed motion with the
electron’s spin axis parallel to the axis of the orbit and having either
the same or the opposite direction ($\mathbf{L}\cdot\boldsymbol{\sigma} = \pm L$). The numerical values
of $\Delta H' = \Delta H'_{\pm}$ can be computed approximately by replacing the
wave-mechanical averages or probable values by the time averages of the
classical (Bohr) theory. The latter gives†

$$ \overline{\Bigl(\frac{1}{r}\Bigr)} = \frac{1}{a}, \qquad \overline{\Bigl(\frac{1}{r^2}\Bigr)} = \frac{1}{ab}, \qquad \overline{\Bigl(\frac{1}{r^3}\Bigr)} = \frac{1}{b^3}, $$

where $a$ is the semi-major and $b$ is the semi-minor axis of the electron’s
elliptical orbit. We thus get


$$ \Delta H' = -\frac{1}{2m_0c^2}\Bigl\{H'^2 + \frac{2Ze^2H'}{a} + \frac{Z^2e^4}{ab}\Bigr\} \pm \frac{\alpha L}{b^3}. \qquad (263\,\mathrm{a}) $$


Now according to the Bohr theory we have further:

$$ H' = -\frac{Ze^2}{2a} = -\frac{2\pi^2m_0Z^2e^4}{h^2n^2}, \qquad a = \frac{h^2n^2}{4\pi^2m_0Ze^2}, \qquad b = \frac{k}{n}\,a, \qquad L = \frac{hk}{2\pi}, $$

where $n$ is the principal and $k$ the angular quantum number. Substituting
these expressions in (263 a), we find




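$$ H'^2 + \frac{2Ze^2}{a}H' + \frac{Z^2e^4}{ab} = \Bigl(-3 + \frac{4n}{k}\Bigr)H'^2 $$

and

$$ \frac{\alpha L}{b^3} = \frac{Ze^2h}{4\pi m_0\cdot 2m_0c^2}\,\frac{hk}{2\pi}\,\frac{n^3}{k^3a^3} = \frac{(Ze^2)^2}{4a^2}\,\frac{2n}{k^2}\,\frac{1}{2m_0c^2} = \frac{2n}{k^2}\,\frac{H'^2}{2m_0c^2}, $$

whence

$$ \Delta H' = \frac{H'^2}{2m_0c^2}\Bigl\{3 - 4n\Bigl(\frac{1}{k} \mp \frac{1}{2k^2}\Bigr)\Bigr\}. \qquad (263\,\mathrm{b}) $$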




This formula was originally obtained in 1925 by Uhlenbeck and
Goudsmit in practically the same way as that shown above, without,
however, any use of the matrix $\boldsymbol{\sigma}$ (the product $\mathbf{L}\cdot\boldsymbol{\sigma}$ being replaced by $\pm L$
on the assumption that the electron’s axis can have only two opposite
orientations parallel to the axis of the orbit).
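The substitution of the Bohr relations into the mass-correction bracket of (263 a) depends only on $Ze^2/a = -2H'$ and $b = (k/n)a$, so it can be checked with exact rational arithmetic. The following is a minimal sketch; the function name and the choice of units ($Ze^2 = 1$, $a = 1$) are illustrative assumptions and do not come from the text.

```python
from fractions import Fraction


def mass_term_ratio(n, k):
    """Return (H'^2 + 2 Z e^2 H'/a + Z^2 e^4/(a b)) / H'^2 for a Bohr orbit.

    Units are chosen so that Z e^2 = 1 and a = 1 (an assumption made only
    for convenience); the Bohr relations then give H' = -1/2 and b = k/n.
    """
    a = Fraction(1)
    b = Fraction(k, n) * a          # semi-minor axis, b = (k/n) a
    h_prime = Fraction(-1, 2) / a   # H' = -Z e^2 / (2 a)
    total = h_prime**2 + 2 * h_prime / a + 1 / (a * b)
    return total / h_prime**2


# The ratio should equal -3 + 4n/k for every allowed pair (n, k).
for n in range(1, 7):
    for k in range(1, n + 1):
        assert mass_term_ratio(n, k) == -3 + Fraction(4 * n, k)
```

Because the check uses `Fraction`, agreement is exact rather than approximate.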

† Cf. Born, Atommechanik, i, p. 164 (Berlin, 1926).

By applying relativity mechanics to the stationary states of the Bohr
theory, Sommerfeld, in 1915, derived the following formula:

$$ \epsilon_{nk} = m_0c^2\Bigl[1 + \Bigl(\frac{\gamma Z}{s + k'}\Bigr)^2\Bigr]^{-1/2}, \qquad (264) $$

which proved to be in exact agreement with the experimental data for
the energy-levels in hydrogen and ionized helium. Here $\gamma$ is a dimensionless
constant

$$ \gamma = \frac{2\pi e^2}{hc} \cong 7\cdot 10^{-3}, \qquad (264\,\mathrm{a}) $$

$s = n - k$ is the radial quantum number, and

$$ k' = \sqrt{k^2 - \gamma^2Z^2}. \qquad (264\,\mathrm{b}) $$

The constant $\gamma Z$ determines the ‘relativity splitting’ of the energy-levels
belonging to the same value of the principal quantum number
$n$, and so determines the ‘fine structure’ of the spectrum. When
$\gamma Z \ll 1$, we can replace formula (264) by the approximate formula

$$ \epsilon_{nk} - \epsilon_n = \frac{2W_n^2}{m_0c^2}\Bigl(\frac{3}{4} - \frac{n}{k}\Bigr), \qquad (264\,\mathrm{c}) $$

where

$$ W_n = \epsilon_n - m_0c^2 = -\frac{m_0c^2\gamma^2Z^2}{2n^2} = -\frac{2\pi^2m_0e^4Z^2}{h^2n^2} $$

stands for $H'$.
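The passage from the exact formula (264) to the approximation (264 c) can be illustrated numerically. The sketch below works in units of $m_0c^2$ and uses the modern value of $\gamma$, which is an assumption (the text quotes only $7\cdot 10^{-3}$); the function names are likewise illustrative.

```python
import math

# Modern value of the constant (264 a); the text quotes roughly 7e-3.
GAMMA = 7.2973525693e-3


def sommerfeld_energy(n, k, z=1):
    """Exact energy (264) in units of m0 c^2."""
    s = n - k                                      # radial quantum number
    k_prime = math.sqrt(k * k - (GAMMA * z) ** 2)  # (264 b)
    return (1.0 + (GAMMA * z / (s + k_prime)) ** 2) ** -0.5


def bohr_energy(n, z=1):
    """W_n = -(gamma Z)^2 / (2 n^2) in units of m0 c^2."""
    return -((GAMMA * z) ** 2) / (2 * n * n)


def shift_exact(n, k, z=1):
    """epsilon_nk - epsilon_n computed from the exact formula."""
    return (sommerfeld_energy(n, k, z) - 1.0) - bohr_energy(n, z)


def shift_approx(n, k, z=1):
    """Right-hand side of (264 c)."""
    return 2.0 * bohr_energy(n, z) ** 2 * (0.75 - n / k)


for n, k in [(2, 1), (2, 2), (3, 1), (3, 3)]:
    print(n, k, shift_exact(n, k), shift_approx(n, k))
```

For $Z = 1$ the two columns should differ only by a relative amount of order $\gamma^2$, which is the size of the terms neglected in (264 c).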


This fine-structure formula of Sommerfeld has been brilliantly confirmed
not only for hydrogen and ionized helium, but also for X-ray
spectra of the heaviest atoms. The number of lines given by it in the
latter case (with $k = 1, 2, \ldots, n$ and with regard to the selection rule
$\Delta k = \pm 1$), or the number of energy-levels in the absorption spectrum
of X-rays comes out, however, too small, being equal to $n$ instead of
$2n-1$, as found experimentally. Thus, for example, we have, when
$n = 2$ (L-group), three energy-levels, while Sommerfeld’s formula only
gives two ($k = 1$ and $k = 2$); when $n = 3$ we have five levels instead
of three, etc.

This difficulty was removed by Uhlenbeck and Goudsmit’s theory of 
the spinning electron. To every orbit specified by the numbers n, k 
there are two possible oppositely directed orientations of the spin axis 
perpendicular to the plane of the orbit. Corresponding to these two 
orientations, we must have two different additional energies which 
bring about the doubling of all the energy-levels $\epsilon_{nk}$, according to the
formula (263 b). 

However, some secondary difficulties remain unexplained by this 
theory: First, one of the levels belonging to the same principal quantum 
number (n) should remain undivided (since the number of different 
levels is equal to $2n-1$ and not to $2n$). This can be explained at once
if we ascribe to the angular quantum number the values $0, 1, \ldots, n-1$
instead of $1, 2, \ldots, n$, i.e. if we introduce straight-line orbits instead of
circular ones, because obviously for such straight-line orbits all orientations
perpendicular to the direction of motion are equivalent. It
should be noticed, however, that the approximate formulae (263 b) and
(264 c), as well as the exact formula (264), cannot be applied to the
case $k = 0$.

Secondly, for hydrogen and ionized helium (briefly, in the case of
atomic systems with a single electron) the experimental data fit exactly
with Sommerfeld’s formula both with regard to the number and the
position of the levels, if $k$ is assumed to take the values $1, 2, \ldots, n$.

This difficulty can also be overcome by a more exact analysis of the 
‘splitting due to spin’ and its comparison with that due to the variability 
of mass (‘relativity splitting’ in the sense of Sommerfeld’s theory). 

Formula (263 b) is not valid for $k = 0$. In general, it is so much
the more accurate the larger $k$ is. In this limiting case we have

$$ \frac{1}{k} \mp \frac{1}{2k^2} \cong \frac{1}{k \pm \tfrac12}, $$

so that formula (263 b) becomes identical with Sommerfeld’s formula
(264 c), provided $k$ ($= n, n-1, n-2, \ldots$) is replaced by $k \pm \tfrac12$, each
energy-level appearing twice for two consecutive values of $k$ (the one
increased and the other diminished by $\tfrac12$).

The appearance of half-integral values of $k$ ($= n - \tfrac12$, $n - \tfrac32$, etc.) can
be explained by the fact that on the wave-mechanical theory the angular
momentum $L$ is equal to $\sqrt{l(l+1)}\,h/2\pi$, and not to $hk/2\pi$. Now since
$l(l+1) = (l+\tfrac12)^2 - \tfrac14$ we can put, for large values of $l$,

$$ L = \frac{h}{2\pi}\bigl(l + \tfrac12\bigr) = \frac{h}{2\pi}\bigl(k - \tfrac12\bigr), $$

where $l = k - 1$ is the angular quantum number of the Schrödinger
theory.†

The average values of $1/r$, $1/r^2$, and $1/r^3$ have been calculated above,
for the sake of simplicity, with the help of the old quantum theory; it
can be shown, however, that the results obtained are not substantially
altered on the Schrödinger theory if Bohr’s $k$ is replaced everywhere
by $l + \tfrac12$.

We shall see in a later section that the exact wave-mechanical theory
based on Dirac’s equation leads, in the case of a one-electron atomic
system, to precisely the same results as the old theory of Sommerfeld,
the spin-doubling remaining unrevealed. It becomes manifest, however,
as soon as we turn to more complicated atoms in which the motion of
each electron takes place in a field of force deviating (owing to the
action of the other electrons) from the purely Coulomb one. This
follows immediately from the expression (263) in which $1/r^3$ must be
replaced by some other (more rapidly decreasing) function of the
distance, with the result that the two terms of (263), corresponding
to the relativistic variation of the mass and to the spin effect, can no
longer be combined into a single term, corresponding on the old theory
to the mass effect alone.

† Cf. infra, § 33.

The two states resulting from a single state of the Schrödinger theory
and specified by the orientation of the electron’s spin angular momentum
in the direction of the orbital angular momentum or in the opposite
direction are distinguished with the help of a special quantum number
(formerly called the ‘inner’ quantum number) $j$, assuming the value
$j = l + \tfrac12$ for the former state and the value $j = l - \tfrac12$ for the latter; the
product of $j$ with $h/2\pi$ can be regarded accordingly as the resulting
angular momentum of the electron. This interpretation corresponds
rather to the old quantum theory; it can be shown, however, that in
wave mechanics the number $j$ plays, in regard to the total angular
momentum M, exactly the same role as the angular quantum number
$l$ in regard to the orbital angular momentum L. We have, for instance,
for the characteristic values of $M^2$

$$ M^2 = \frac{h^2}{4\pi^2}\,j(j+1), $$

which can be obtained from the formula $M^2 = (\mathbf{L}+\mathbf{s})^2 = L^2 + 2\mathbf{L}\cdot\mathbf{s} + s^2$,
where s denotes the spin angular momentum, if we put $s^2 = \tfrac34h^2/4\pi^2$,
$L^2 = h^2l(l+1)/4\pi^2$, and $2\mathbf{L}\cdot\mathbf{s} = h^2l/4\pi^2$ in the case $j = l + \tfrac12$ (in the case
$j = l - \tfrac12$, $l$ must be replaced by $-l-1$).

As has been shown above, for a motion in a Coulomb field of force the 
inner quantum number j also plays the same role as l —in the absence 
of spin—with regard to the energy. 

We shall presently see that this correspondence between j and l can 
be further extended in describing the splitting of the energy-levels 
produced by a weak magnetic field. 


B. Influence of a magnetic field (Zeeman effect) 

The preceding theory can easily be generalized to allow for the
presence of a homogeneous magnetic field $\mathfrak{H}$. The radially symmetrical
electric field will be represented by the vector $\mathbf{E} = f(r)\,\mathbf{r}$.
If the unperturbed motion is defined as that corresponding to the
absence of the magnetic field and to the neglect of the relativity (mass-spin)
corrections, i.e. if it is specified by the ordinary energy operator
$H = p^2/2m_0 + U$ (multiplied by $\delta = \begin{pmatrix}1&0\\0&1\end{pmatrix}$), then neglecting terms
of the second order in $\mathfrak{H}$ we can represent the complete energy operator
$K$ as the sum of $H$ and of the perturbation energy $S$, in which
$\mu = he/(4\pi m_0c)$ denotes the absolute value of the electron’s intrinsic
moment, the electronic charge being denoted by $-e$ so that

$$ -e\mathbf{E} = -\nabla U, \quad\text{or}\quad f(r) = \frac{1}{er}\frac{dU}{dr}. $$

This can be written in the form

$$ S = A + \mathbf{B}\cdot\boldsymbol{\sigma} \qquad (265) $$

with

$$ A = -\frac{1}{2m_0c^2}(H' - U)^2 + \frac{e}{2m_0c}\,\mathfrak{H}\cdot\mathbf{L} \qquad (265\,\mathrm{a}) $$

and

$$ \mathbf{B} = -\beta\mathbf{L} + \mu\mathfrak{H}, \qquad (265\,\mathrm{b}) $$

where $\beta = \mu f/(2m_0c)$.

The determination of the energy-levels of the two perturbed states
resulting from a single unperturbed one can be carried out with the
help of the general method outlined in the preceding section in connexion
with a perturbation due to the magnetic field alone [see equations
(250)-(251 b)]. We thus get

$$ \Delta H' = \overline{A} \pm \overline{B}, \qquad (266) $$

where $\overline{A} = \int \psi^*A\psi\,dV$ and $\overline{B} = \sqrt{(\overline{B_x})^2 + (\overline{B_y})^2 + (\overline{B_z})^2}$ is the quadratic
average of the vector B. If L is dealt with as a constant vector (which
is quite exact for the unperturbed motion), we have

$$ \overline{B} = \sqrt{\beta^2L^2 - 2\beta\mu\,\mathfrak{H}\cdot\mathbf{L} + \mu^2\mathfrak{H}^2}. \qquad (266\,\mathrm{a}) $$

In the extreme case of a very strong magnetic field, such that $\mu\mathfrak{H} \gg \beta L$,
this expression reduces to $\mu\mathfrak{H}$. Putting, further, $\mathfrak{H}\cdot\mathbf{L} = h\mathfrak{H}m_l/2\pi$,
where $m_l$ is the axial (magnetic) quantum number for the orbital
motion, and neglecting the first terms in (265 a) and (265 b) compared
with the second ones, we get

$$ \Delta H' = \mu\mathfrak{H}(m_l \pm 1), \qquad (266\,\mathrm{b}) $$

i.e. the same result as in the case of the ‘normal’ Zeeman effect, corresponding
to the absence of spin; the influence of the latter is expressed
in the replacement of the axial quantum number $m_l$ by $m = m_l \pm 1$,
both numbers being integers.

In the opposite case of a very weak magnetic field ($\mu\mathfrak{H} \ll \beta L$) we
obtain a splitting of a different type, usually denoted as the ‘anomalous’
Zeeman effect. Expanding the exact expression (266 a), and neglecting
the terms of the second and higher orders in $\mathfrak{H}$, we get

$$ \overline{B} = \beta L\Bigl(1 - \frac{\mu\,\mathfrak{H}\cdot\mathbf{L}}{\beta L^2}\Bigr) = \beta L - \mu\,\frac{\mathfrak{H}\cdot\mathbf{L}}{L}, $$

or, putting $L = h(l+\tfrac12)/2\pi$ and neglecting the ‘relativity correction’
(represented by the first term in (265 a)),

Ml - ±j5£+/*6i»,(l±j_h). (266c) 

where the upper and lower signs refer to the values $j = l + \tfrac12$ and
$j = l - \tfrac12$ of the ‘inner quantum number’ which determines the total
angular momentum M.
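The two limiting forms of the quadratic average (266 a) can be illustrated numerically. The sketch below takes the cross term with the sign in which it appears in (266 a); the parameter names (`beta_l` for $\beta L$, `mu_h` for $\mu\mathfrak{H}$, `cos_theta` for the cosine of the angle between $\mathfrak{H}$ and L) are assumptions introduced only for this illustration.

```python
import math


def quadratic_average_b(beta_l, mu_h, cos_theta):
    """B-bar of (266 a): sqrt(beta^2 L^2 - 2 beta mu H.L + mu^2 H^2)."""
    return math.sqrt(beta_l**2 - 2.0 * beta_l * mu_h * cos_theta + mu_h**2)


def weak_field_form(beta_l, mu_h, cos_theta):
    """First-order expansion in the field: beta L - mu (H.L)/L."""
    return beta_l - mu_h * cos_theta


def strong_field_form(mu_h):
    """Leading term for mu H >> beta L."""
    return mu_h


# Weak field: mu H is a millionth of beta L.
print(quadratic_average_b(1.0, 1e-6, 0.3), weak_field_form(1.0, 1e-6, 0.3))
# Strong field: beta L is a millionth of mu H.
print(quadratic_average_b(1e-6, 1.0, 0.3), strong_field_form(1.0))
```

In both regimes the exact square root and the limiting form agree up to terms of the neglected order.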

This result in a somewhat different external form involving the axial
quantum number $m_j$, which determines the component of M along the
magnetic field, so long as the latter is supposed to be weak, can be
obtained by the following simple argument.

We have seen above that in the absence of a magnetic field the
vectors L and s (spin angular momentum) are not constants of the
motion, even if the latter takes place in a radially symmetrical electrical
field; the sum $\mathbf{L} + \mathbf{s} = \mathbf{M}$ (total angular momentum) is, however, constant
in this case. Further, it can easily be shown that the squares of
s and L remain constant, so that the perturbation produced by the
spin alone can be pictured as the rotation (precession) of the two vectors
L and s of constant magnitude about their resultant M (Fig. 3). The
average values of s and L must therefore be parallel to M, and can be
expressed accordingly by the equations

$$ \overline{\mathbf{s}} = (g-1)\mathbf{M}, \qquad \overline{\mathbf{L}} = (2-g)\mathbf{M}, \qquad (267) $$

where $g$ is a certain numerical coefficient ($\overline{\mathbf{s}} + \overline{\mathbf{L}} = \mathbf{s} + \mathbf{L} = \mathbf{M}$).

Fig. 3.

It should be mentioned that this ‘graphical’ representation of the
spin perturbation does not give correct results if we assume at the outset
that the vectors s and L are parallel to each other (in the same or
opposite directions), as has been concluded previously from equation
(263).

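The coefficient $g$ can be determined with the help of the formula
$L^2 = (\mathbf{M}-\mathbf{s})^2 = M^2 - 2\mathbf{M}\cdot\mathbf{s} + s^2$ if we put $L^2 = h^2l(l+1)/4\pi^2$,
$M^2 = h^2j(j+1)/4\pi^2$, $s^2 = \tfrac34h^2/4\pi^2$ and replace the scalar product $\mathbf{M}\cdot\mathbf{s}$
by $(g-1)M^2$. This gives

$$ g - 1 = \frac{j(j+1) - l(l+1) + \tfrac34}{2j(j+1)}, \qquad (267\,\mathrm{a}) $$

that is

$$ g - 1 = \pm\frac{1}{2l+1} \qquad (j = l \pm \tfrac12). \qquad (267\,\mathrm{b}) $$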
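The reduction of (267 a) to (267 b) can be checked with exact fractions. The following sketch evaluates $g$ directly from $L^2 = M^2 - 2\mathbf{M}\cdot\mathbf{s} + s^2$ with $\mathbf{M}\cdot\mathbf{s} = (g-1)M^2$; the function name `lande_g` is an illustrative assumption, and squared angular momenta are measured in units of $(h/2\pi)^2$.

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def lande_g(l, j):
    """g for a single electron (spin 1/2) with orbital number l and inner number j."""
    m2 = j * (j + 1)               # M^2
    l2 = Fraction(l * (l + 1))     # L^2
    s2 = HALF * (HALF + 1)         # s^2 = 3/4
    # From L^2 = M^2 - 2 (g - 1) M^2 + s^2:
    return 1 + (m2 + s2 - l2) / (2 * m2)


for l in range(1, 6):
    assert lande_g(l, l + HALF) == 1 + Fraction(1, 2 * l + 1)
    assert lande_g(l, l - HALF) == 1 - Fraction(1, 2 * l + 1)
```

For $l = 1$ this gives $g = 4/3$ for $j = 3/2$ and $g = 2/3$ for $j = 1/2$.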

The perturbation produced by a sufficiently weak magnetic field can
be pictured in the same graphical way as the rotation (precession) of
the parallelogram, formed by the vectors s, L, M, as a rigid body about
the direction of the magnetic field, the magnitude of all the three
vectors remaining thus constant as before.

The additional magnetic energy can be determined to the first
approximation as the average value of the magnetic perturbation energy

$$ S_m = \frac{e}{2m_0c}\,\mathfrak{H}\cdot(\mathbf{L} + 2\mathbf{s}) = \frac{e}{2m_0c}\,\mathfrak{H}\cdot(\mathbf{M} + \mathbf{s}) $$

for the unperturbed motion. Replacing s by $(g-1)\mathbf{M}$, we get

$$ \Delta H' = \frac{eg}{2m_0c}\,\mathfrak{H}\cdot\mathbf{M}. \qquad (268) $$

The factor $g$ was introduced for the first time by Landé (in 1922).
It can be interpreted as the ratio of the angular velocity of precession
of the (s, L) parallelogram about the direction of $\mathfrak{H}$ to the classical or
‘Larmor’ angular velocity $\omega = e\mathfrak{H}/(2m_0c)$, which corresponds to the
absence of spin.

The projection of the vector M on $\mathfrak{H}$ preserves a constant quantized
value which can be shown to be given by the formula

$$ M_{\mathfrak{H}} = \frac{h}{2\pi}\,m_j, \qquad (268\,\mathrm{a}) $$

where $m_j$ is the axial quantum number. For a state with a given $j$ it
can assume the $2j+1$ half-integral values lying between $+j$ and $-j$.
It thus plays, with regard to $j$, exactly the same role as the ordinary
axial quantum number with regard to $l$ in the theory of the spinless
electron.

With the help of (268 a) the expression (268) can be rewritten in
the form

$$ \Delta H' = \mu\mathfrak{H}\,m_j\,g = \mu\mathfrak{H}\,m_j\Bigl(1 \pm \frac{1}{2l+1}\Bigr). \qquad (268\,\mathrm{b}) $$

It differs from (266 c) (without the term f3L not involving the magnetic 

field) by the fact that m l is replaced by Wj and ^ ^ by This 

difference is, however, easily seen to correspond to the connexion 
between the projections of the vectors L and M on the magnetic field. 
Replacing the vector L in (266 a) by its average value according to
(267), we get

$$ \mathfrak{H}\cdot\overline{\mathbf{L}} = (2-g)\,\mathfrak{H}\cdot\mathbf{M} = (2-g)\,\mathfrak{H}\,\frac{h}{2\pi}\,m_j $$

and consequently 

Air = r&mtf-g) 



instead of the expression (266 c)—or that part of it which is propor¬ 
tional to §. Equating this to \i^gm p we obtain the following equation 
for the factor g: 


whence approximately 




2(0-1) 


: i+V 


which coincides with (267 b). 

Each level, specified by the quantum numbers $n, l, j$, is split up in
a weak magnetic field into $2j+1$ equidistant levels with the spacing

$$ \Delta H' = \mu\mathfrak{H}\,g = \mu\mathfrak{H}\Bigl(1 \pm \frac{1}{2l+1}\Bigr), $$

where the plus sign refers to the case $j = l + \tfrac12$ and the minus sign to
the case $j = l - \tfrac12$.

We have assumed above that the magnetic field was ‘sufficiently
small’. The standard field with which it has to be compared in this
sense is the ‘effective’ magnetic field which determines the spin perturbation
in the case $\mathfrak{H} = 0$. This field is parallel and proportional to
L, as has been shown above [cf. eq. (262 b)], and can therefore be
defined by the formula $\mathfrak{H}_{\mathrm{eff}} = \beta\mathbf{L}$.
If $\mathfrak{H}$ is much larger than $\mathfrak{H}_{\mathrm{eff}}$, the vectors L and s are no longer
held together in the rigid parallelogram (Fig. 3), but must be imagined
to precess independently about the direction of $\mathfrak{H}$, the former with the
normal Larmor frequency and the latter with twice this frequency. We
get in this case, instead of (268),

$$ \Delta H' = \frac{e}{2m_0c}\,(\mathfrak{H}\cdot\mathbf{L} + 2\,\mathfrak{H}\cdot\mathbf{s}) = \mu\mathfrak{H}(m_l \pm 1), $$

in agreement with (266 b). The modification of the Zeeman effect which
takes place in a transition from a weak magnetic field to a strong one is
known as the Paschen-Back effect.

The preceding results will be established in a more rigorous and 
complete way in a later section on the basis of Dirac’s exact theory. 


31. The Exact Four-dimensional Matrix Theory of Dirac 

The four equations of the Dirac theory, which in the last section were 
written in the form of two matrix equations of the Pauli type, can be 
put in the form of a single matrix equation (they were actually first 
given by Dirac in this form), in a way perfectly similar to that which 
has been applied for the same purpose to the Pauli equations. 

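The four functions of Dirac, $\psi_1, \psi_2, \psi_3, \psi_4$, will be considered accordingly
as the four elements of a one-column matrix:

$$ \psi = \begin{pmatrix}\psi_1\\ \psi_2\\ \psi_3\\ \psi_4\end{pmatrix} \qquad (269) $$

(or the components of a four-dimensional vector), the adjoint matrix
(complex conjugate vector) being

$$ \psi^\dagger = \{\psi_1^*, \psi_2^*, \psi_3^*, \psi_4^*\}. \qquad (269\,\mathrm{a}) $$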

Introducing a suitably defined square matrix of the fourth rank
(four-dimensional tensor) $A$ we can represent the four first-order equations
(229a)-(229b) as the four components of the matrix (or vector)
equation

$$ A\psi = 0, \qquad (270) $$

writing them in the form

$$ \begin{aligned}
(A\psi)_1 &= A_{11}\psi_1 + A_{12}\psi_2 + A_{13}\psi_3 + A_{14}\psi_4 = 0,\\
(A\psi)_2 &= A_{21}\psi_1 + A_{22}\psi_2 + A_{23}\psi_3 + A_{24}\psi_4 = 0,\\
(A\psi)_3 &= A_{31}\psi_1 + A_{32}\psi_2 + A_{33}\psi_3 + A_{34}\psi_4 = 0,\\
(A\psi)_4 &= A_{41}\psi_1 + A_{42}\psi_2 + A_{43}\psi_3 + A_{44}\psi_4 = 0.
\end{aligned} \qquad (270\,\mathrm{a}) $$

Identifying these equations respectively with the first, second, third,




This form of the Dirac equations corresponds to a privileged role of the
coordinate $x$, the associated matrix $\alpha_x$ reducing to the four-dimensional
unit-matrix $\delta$. It is possible, however, to rewrite them in four other
equivalent forms, corresponding to the shifting of this privilege to one
of the other four matrices $\alpha$. This can be done in the simplest way by
rearranging the original equations (229a)-(229 b) and eventually multiplying
them by $-1$. For instance, to reduce the matrix $\alpha_0$ to $\delta$ we
multiply the two equations (229 a) by $-1$, and rewrite the four equations
in the reverse order. We thus get

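$$ \begin{aligned}
(u_x + iu_y)\psi_4 - u_z\psi_3 + (u_t + m_0c)\psi_1 &= 0,\\
(u_x - iu_y)\psi_3 + u_z\psi_4 + (u_t + m_0c)\psi_2 &= 0,\\
-(u_x + iu_y)\psi_2 + u_z\psi_1 + (-u_t + m_0c)\psi_3 &= 0,\\
-(u_x - iu_y)\psi_1 - u_z\psi_2 + (-u_t + m_0c)\psi_4 &= 0,
\end{aligned} $$

which can be written in the form

$$ B\psi = 0, \qquad (272) $$

with

$$ B = \beta_xu_x + \beta_yu_y + \beta_zu_z + \beta_tu_t + \beta_0m_0c, \qquad (272\,\mathrm{a}) $$

where

$$ \beta_x = \begin{pmatrix}0&0&0&1\\0&0&1&0\\0&-1&0&0\\-1&0&0&0\end{pmatrix},\quad
\beta_y = \begin{pmatrix}0&0&0&i\\0&0&-i&0\\0&-i&0&0\\i&0&0&0\end{pmatrix},\quad
\beta_z = \begin{pmatrix}0&0&-1&0\\0&0&0&1\\1&0&0&0\\0&-1&0&0\end{pmatrix}, $$

$$ \beta_t = \begin{pmatrix}1&0&0&0\\0&1&0&0\\0&0&-1&0\\0&0&0&-1\end{pmatrix},\quad
\beta_0 = \begin{pmatrix}1&0&0&0\\0&1&0&0\\0&0&1&0\\0&0&0&1\end{pmatrix}. \qquad (272\,\mathrm{b}) $$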
Rewriting equations (229a)-(229 b) in the inverse order without
multiplying (229 a) by $-1$, we get in a similar way

$$ \Gamma\psi = 0, \qquad (273) $$

with

$$ \Gamma = \gamma_xu_x + \gamma_yu_y + \gamma_zu_z + \gamma_tu_t + \gamma_0m_0c, \qquad (273\,\mathrm{a}) $$

where

$$ \gamma_x = \begin{pmatrix}0&0&0&1\\0&0&1&0\\0&1&0&0\\1&0&0&0\end{pmatrix},\quad
\gamma_y = \begin{pmatrix}0&0&0&i\\0&0&-i&0\\0&i&0&0\\-i&0&0&0\end{pmatrix},\quad
\gamma_z = \begin{pmatrix}0&0&-1&0\\0&0&0&1\\-1&0&0&0\\0&1&0&0\end{pmatrix}, $$

$$ \gamma_t = \begin{pmatrix}1&0&0&0\\0&1&0&0\\0&0&1&0\\0&0&0&1\end{pmatrix},\quad
\gamma_0 = \begin{pmatrix}1&0&0&0\\0&1&0&0\\0&0&-1&0\\0&0&0&-1\end{pmatrix}. \qquad (273\,\mathrm{b}) $$


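This last form of the Dirac equations is especially useful because the
matrices $\gamma$ are all Hermitian, while the matrices $\alpha$ and $\beta$ are not. There
is, moreover, a very simple relationship between the Dirac matrices
$\gamma_x, \gamma_y, \gamma_z$ and Pauli ‘spin’ matrices $\sigma_x, \sigma_y, \sigma_z$ which can be expressed
by the equations

$$ \gamma_x = \begin{pmatrix}0&\sigma_x\\ \sigma_x&0\end{pmatrix},\quad
\gamma_y = \begin{pmatrix}0&\sigma_y\\ \sigma_y&0\end{pmatrix},\quad
\gamma_z = \begin{pmatrix}0&\sigma_z\\ \sigma_z&0\end{pmatrix}, \qquad (274) $$

with 0 meaning the two-dimensional zero matrix $\begin{pmatrix}0&0\\0&0\end{pmatrix}$. The Dirac
matrices $\gamma_x, \gamma_y, \gamma_z$ can be thus defined as ‘supermatrices’ of the second
rank, whose elements are constituted by the corresponding Pauli
matrices and the two-dimensional zero matrices.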

Further, it can easily be shown that the matrices $\gamma_x, \gamma_y, \gamma_z$, just like
the Pauli matrices $\sigma_x, \sigma_y, \sigma_z$, anticommute with each other and with the
matrix $\gamma_0$, so that putting for the sake of brevity

$$ \gamma_x = \gamma_1,\quad \gamma_y = \gamma_2,\quad \gamma_z = \gamma_3,\quad \gamma_0 = \gamma_4 $$

($\gamma_t$ must be left aside, since it is equal to the unit matrix $\delta$), we have

$$ \gamma_\mu\gamma_\nu = -\gamma_\nu\gamma_\mu \qquad (\mu \neq \nu). \qquad (274\,\mathrm{a}) $$

A relation of the type $\sigma_x\sigma_y = i\sigma_z$, etc., does not hold, however, for the
matrices $\gamma_x, \gamma_y, \gamma_z$. We have, for instance, according to (273 b),

$$ \gamma_x\gamma_y = \begin{pmatrix}-i&0&0&0\\0&i&0&0\\0&0&-i&0\\0&0&0&i\end{pmatrix}, $$

which is different from $i\gamma_z$.

To equations (274 a) we may add the equations

$$ \gamma_\mu^2 = \delta, \qquad (274\,\mathrm{b}) $$

which are easily verified. 
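The relations (274 a) and (274 b), together with the Hermitian character of the matrices $\gamma$, can be checked mechanically. The sketch below is a plain-Python illustration; the matrix entries are those of (273 b) as read from the printed page, and the helper names (`matmul`, `scale`, `dagger`) are assumptions introduced here.

```python
I = 1j

GAMMA_X = [[0, 0, 0, 1], [0, 0, 1, 0], [0, 1, 0, 0], [1, 0, 0, 0]]
GAMMA_Y = [[0, 0, 0, I], [0, 0, -I, 0], [0, I, 0, 0], [-I, 0, 0, 0]]
GAMMA_Z = [[0, 0, -1, 0], [0, 0, 0, 1], [-1, 0, 0, 0], [0, 1, 0, 0]]
GAMMA_0 = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
UNIT = [[1 if r == c else 0 for c in range(4)] for r in range(4)]
GAMMAS = [GAMMA_X, GAMMA_Y, GAMMA_Z, GAMMA_0]


def matmul(a, b):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(a[r][m] * b[m][c] for m in range(4)) for c in range(4)]
            for r in range(4)]


def scale(factor, a):
    """Multiply every element of a matrix by a number."""
    return [[factor * x for x in row] for row in a]


def dagger(a):
    """Hermitian conjugate (transpose and complex conjugate)."""
    return [[complex(a[c][r]).conjugate() for c in range(4)] for r in range(4)]


for i, g in enumerate(GAMMAS):
    assert dagger(g) == g                      # Hermitian
    assert matmul(g, g) == UNIT                # (274 b)
    for h in GAMMAS[i + 1:]:
        assert matmul(g, h) == scale(-1, matmul(h, g))   # (274 a)

# The product gamma_x gamma_y is diagonal and differs from i * gamma_z.
PRODUCT_XY = matmul(GAMMA_X, GAMMA_Y)
assert PRODUCT_XY == [[-I, 0, 0, 0], [0, I, 0, 0], [0, 0, -I, 0], [0, 0, 0, I]]
assert PRODUCT_XY != scale(I, GAMMA_Z)
```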

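It should be mentioned that the four matrices $\alpha$ or $\beta$ (which are
different from $\delta$) also satisfy anticommutative relations of the type
(274 a), while their squares are equal to $\pm\delta$. We have, namely,

$$ \begin{aligned}
\beta_x^2 = \beta_y^2 = \beta_z^2 &= -\delta, & \beta_t^2 &= \delta,\\
\alpha_y^2 = \alpha_z^2 = \alpha_0^2 &= -\delta, & \alpha_t^2 &= \delta,
\end{aligned} \qquad (274\,\mathrm{c}) $$

and, of course, $\beta_0^2 = \alpha_x^2 = \delta$ (since $\beta_0 = \alpha_x = \delta$).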

With the help of these relations the transition from one form of
Dirac’s equations to some other equivalent form can be carried out by
the multiplication of the former by that matrix which must be replaced
by $\delta$ (with the $+$ or $-$ sign as the case may be). We have, for example,

$$ A = \gamma_x\Gamma,\quad B = \gamma_0\Gamma,\quad A = -\beta_xB,\quad \Gamma = \alpha_tA, $$

which means that

$$ \alpha_x = \gamma_x^2,\quad \alpha_y = \gamma_x\gamma_y,\quad \alpha_z = \gamma_x\gamma_z,\quad \alpha_t = \gamma_x\gamma_t = \gamma_x,\quad \alpha_0 = \gamma_x\gamma_0;\quad \beta_x = \gamma_0\gamma_x, $$

etc.; these relations can be verified directly.
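The direct verification can be sketched as follows. The matrices $\alpha$ and $\beta$ are built here from the $\gamma$ matrices of (273 b) through $A = \gamma_x\Gamma$ and $B = \gamma_0\Gamma$; the dictionary layout and helper names are illustrative assumptions, and the entries are those read from the printed page.

```python
I = 1j
UNIT = [[1 if r == c else 0 for c in range(4)] for r in range(4)]
GAMMA = {
    "x": [[0, 0, 0, 1], [0, 0, 1, 0], [0, 1, 0, 0], [1, 0, 0, 0]],
    "y": [[0, 0, 0, I], [0, 0, -I, 0], [0, I, 0, 0], [-I, 0, 0, 0]],
    "z": [[0, 0, -1, 0], [0, 0, 0, 1], [-1, 0, 0, 0], [0, 1, 0, 0]],
    "t": UNIT,
    "0": [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]],
}


def matmul(a, b):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(a[r][m] * b[m][c] for m in range(4)) for c in range(4)]
            for r in range(4)]


def neg(a):
    """Change the sign of every element."""
    return [[-x for x in row] for row in a]


# A = gamma_x * Gamma and B = gamma_0 * Gamma, component by component.
ALPHA = {name: matmul(GAMMA["x"], g) for name, g in GAMMA.items()}
BETA = {name: matmul(GAMMA["0"], g) for name, g in GAMMA.items()}

assert ALPHA["x"] == UNIT and BETA["0"] == UNIT
assert ALPHA["t"] == GAMMA["x"]

# beta_x as printed in (272 b).
assert BETA["x"] == [[0, 0, 0, 1], [0, 0, 1, 0], [0, -1, 0, 0], [-1, 0, 0, 0]]

# Squares equal to +delta or -delta.
for name in ("x", "y", "z"):
    assert matmul(BETA[name], BETA[name]) == neg(UNIT)
assert matmul(BETA["t"], BETA["t"]) == UNIT
for name in ("y", "z", "0"):
    assert matmul(ALPHA[name], ALPHA[name]) == neg(UNIT)
assert matmul(ALPHA["t"], ALPHA["t"]) == UNIT

# A = -beta_x * B, component by component.
for name in GAMMA:
    assert neg(matmul(BETA["x"], BETA[name])) == ALPHA[name]
```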

We can further easily derive from the first-order equations the
second-order equations of Dirac’s theory in a similar matrix form. This
can be done in the simplest way by applying to the equation $B\psi = 0$
the operator

$$ \overline{B} = -(\beta_xu_x + \beta_yu_y + \beta_zu_z + \beta_tu_t) + \beta_0m_0c. $$

We thus get $\overline{B}B\psi = 0$, or, carrying out the multiplication and taking
account of the relations (274 a) and (274 b):

$$ \begin{aligned}
\bigl\{(u_x^2 + u_y^2 + u_z^2 - u_t^2 + m_0^2c^2) - \bigl[&\beta_y\beta_z(u_yu_z - u_zu_y) + \beta_z\beta_x(u_zu_x - u_xu_z)\\
&+ \beta_x\beta_y(u_xu_y - u_yu_x) + \beta_x\beta_t(u_xu_t - u_tu_x)\\
&+ \beta_y\beta_t(u_yu_t - u_tu_y) + \beta_z\beta_t(u_zu_t - u_tu_z)\bigr]\bigr\}\psi = 0.
\end{aligned} $$

This equation can be written in the form

$$ Q\psi = 0 \qquad (275) $$

with the matrix operator

$$ Q = D - \frac{he}{2\pi c}\,(\mathbf{H}\cdot\boldsymbol{\xi} + \mathbf{E}\cdot\boldsymbol{\eta}), \qquad (275\,\mathrm{a}) $$

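where $D = u_x^2 + u_y^2 + u_z^2 - u_t^2 + m_0^2c^2$
as before, while $\boldsymbol{\xi}$ and $\boldsymbol{\eta}$ are vector-matrices with the rectangular
components

$$ \xi_x = \begin{pmatrix}0&1&0&0\\1&0&0&0\\0&0&0&1\\0&0&1&0\end{pmatrix},\quad
\xi_y = \begin{pmatrix}0&i&0&0\\-i&0&0&0\\0&0&0&i\\0&0&-i&0\end{pmatrix},\quad
\xi_z = \begin{pmatrix}-1&0&0&0\\0&1&0&0\\0&0&-1&0\\0&0&0&1\end{pmatrix}. \qquad (275\,\mathrm{b}) $$

We can also write down the relations

$$ \begin{aligned}
\xi_x &= i\beta_y\beta_z, & \xi_y &= i\beta_z\beta_x, & \xi_z &= i\beta_x\beta_y,\\
\eta_x &= i\beta_x\beta_t, & \eta_y &= i\beta_y\beta_t, & \eta_z &= i\beta_z\beta_t,
\end{aligned} \qquad (275\,\mathrm{c}) $$

or in vector notation

$$ \boldsymbol{\xi} = \tfrac12 i\,\boldsymbol{\beta}\times\boldsymbol{\beta}, \qquad \boldsymbol{\eta} = i\boldsymbol{\beta}\beta_t = -i\boldsymbol{\gamma}. \qquad (276\,\mathrm{a}) $$

The identity of equation (275) with the four equations (230)-(230a)
is easily verified.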

It should be mentioned that the actual way in which Dirac first
obtained his first-order equation $B\psi = 0$ was to some extent the reverse
of the preceding derivation for the particular case of the free motion when
the matrix $Q$ reduces to the operator $D$ (multiplied by $\delta$). Assuming the
possibility of representing $Q$ in this case in the form $\overline{B}B$ one can easily
obtain the conditions $\beta_x^2 = \beta_y^2 = \beta_z^2 = -\beta_t^2 = -\delta$ and
$\beta_\mu\beta_\nu = -\beta_\nu\beta_\mu$ ($\mu \neq \nu$) for the matrices $\beta$; after this the first-order equation $B\psi = 0$
is naturally generalized for the motion of the electron in an arbitrary
field of force (by replacing p by u), and finally the corresponding
generalized expression for the second-order operator $Q$ is obtained in
the way shown above.

We have preferred to this straightforward method of Dirac the some¬ 
what more lengthy and complicated path starting with Maxwell’s 
equations, because of the resulting gain in the comprehensiveness of 
the theory. Moreover, the determination of the matrices $\beta$ from the
properties above stated is an ambiguous problem, which can be solved
only after some assumption has been made as to their rank, i.e. the
number of wave functions $\psi$, whereas in our derivation this number is
settled from the beginning with the help of the analogy between 
d’Alembert’s equation and Maxwell’s equations on the one hand, and 
the wave-mechanical equations of the second and first order on the 
other. 
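The four-dimensional second-order equation (275) is equivalent to
the two equations (258) and (258 a) involving the two-dimensional Pauli
spin matrix $\boldsymbol{\sigma}$. The Dirac matrix $\boldsymbol{\xi}$ can be defined as a duplication of
the latter according to the formula

$$ \boldsymbol{\xi} = \begin{pmatrix}\boldsymbol{\sigma}&0\\0&\boldsymbol{\sigma}\end{pmatrix}, \qquad (277) $$

where 0 is short for the two-dimensional zero-matrix. This
formula is equivalent to the following three:

$$ \xi_x = \begin{pmatrix}\sigma_x&0\\0&\sigma_x\end{pmatrix},\quad
\xi_y = \begin{pmatrix}\sigma_y&0\\0&\sigma_y\end{pmatrix},\quad
\xi_z = \begin{pmatrix}\sigma_z&0\\0&\sigma_z\end{pmatrix}, $$

which differ from the formulae (274) for $\gamma_x, \gamma_y, \gamma_z$ by the fact that the
duplication is carried out in the direction of the right diagonal and not
of the left one. The formulae (274) can be replaced by the single vector
formula

$$ \boldsymbol{\gamma} = \begin{pmatrix}0&\boldsymbol{\sigma}\\ \boldsymbol{\sigma}&0\end{pmatrix}. $$

The vectors $\boldsymbol{\gamma}$ and $\boldsymbol{\xi}$ are easily seen to be connected with each other
by the relations

$$ \boldsymbol{\gamma} = \rho\boldsymbol{\xi} = \boldsymbol{\xi}\rho, \qquad \boldsymbol{\xi} = \rho\boldsymbol{\gamma} = \boldsymbol{\gamma}\rho, \qquad (277\,\mathrm{a}) $$

where $\rho$ is the scalar matrix

$$ \rho = \begin{pmatrix}0&0&1&0\\0&0&0&1\\1&0&0&0\\0&1&0&0\end{pmatrix} = \begin{pmatrix}0&\delta\\ \delta&0\end{pmatrix} \qquad (\rho^2 = 1), \qquad (277\,\mathrm{b}) $$

which commutes with $\boldsymbol{\gamma}$ and anticommutes with $\gamma_0$:

$$ \rho\gamma_0 = -\gamma_0\rho. $$

It should be mentioned that $\gamma_0$ commutes with $\boldsymbol{\xi}$ (since it anticommutes
both with $\boldsymbol{\gamma}$ and with $\rho$). We have further, from comparing (273 b)
and (273 c):

$$ \boldsymbol{\eta} = -i\rho\boldsymbol{\xi}. \qquad (277\,\mathrm{c}) $$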
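The block structure of these relations lends itself to a direct check. The sketch below assembles $\boldsymbol{\xi}$, $\boldsymbol{\gamma}$ and $\rho$ from two-rowed blocks and verifies (277 a)-(277 c); the Pauli matrices are taken in the form implied by the printed matrices (275 b), and all helper names are illustrative assumptions.

```python
I = 1j

# Pauli matrices in the convention implied by the printed (275 b).
SIGMA = {"x": [[0, 1], [1, 0]], "y": [[0, I], [-I, 0]], "z": [[-1, 0], [0, 1]]}
ZERO2 = [[0, 0], [0, 0]]
UNIT2 = [[1, 0], [0, 1]]
UNIT = [[1 if r == c else 0 for c in range(4)] for r in range(4)]


def block(a, b, c, d):
    """Assemble the 4x4 matrix [[a, b], [c, d]] from 2x2 blocks."""
    return [ra + rb for ra, rb in zip(a, b)] + [rc + rd for rc, rd in zip(c, d)]


def matmul(a, b):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(a[r][m] * b[m][c] for m in range(4)) for c in range(4)]
            for r in range(4)]


def scale(factor, a):
    """Multiply every element of a matrix by a number."""
    return [[factor * x for x in row] for row in a]


def dagger(a):
    """Hermitian conjugate (transpose and complex conjugate)."""
    return [[complex(a[c][r]).conjugate() for c in range(4)] for r in range(4)]


XI = {k: block(s, ZERO2, ZERO2, s) for k, s in SIGMA.items()}      # (277)
GAMMA = {k: block(ZERO2, s, s, ZERO2) for k, s in SIGMA.items()}   # (274)
RHO = block(ZERO2, UNIT2, UNIT2, ZERO2)                            # (277 b)
GAMMA_0 = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
ETA = {k: scale(-I, matmul(RHO, XI[k])) for k in SIGMA}            # (277 c)

assert matmul(RHO, RHO) == UNIT
assert matmul(RHO, GAMMA_0) == scale(-1, matmul(GAMMA_0, RHO))
for k in SIGMA:
    assert matmul(RHO, XI[k]) == GAMMA[k] == matmul(XI[k], RHO)    # (277 a)
    assert matmul(RHO, GAMMA[k]) == XI[k]
    assert matmul(GAMMA_0, XI[k]) == matmul(XI[k], GAMMA_0)
    assert dagger(XI[k]) == XI[k]                 # Hermitian
    assert dagger(ETA[k]) == scale(-1, ETA[k])    # anti-Hermitian
    assert matmul(ETA[k], ETA[k]) == scale(-1, UNIT)
```

The last assertion expresses the fact that the characteristic values of each component of $\boldsymbol{\eta}$ are $\pm i$.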

The expression (275 a) for the matrix operator $Q$ can thus be rewritten
in the form

$$ Q = D - \frac{he}{2\pi c}\,(\mathbf{H} - i\rho\mathbf{E})\cdot\boldsymbol{\xi}, $$

where the factor $\delta$ is to be understood in $D$.

It is clear that the matrix $\boldsymbol{\xi}$ must have in the Dirac theory a similar
physical meaning to that of the matrix $\boldsymbol{\sigma}$ in the Pauli theory, i.e. it must
represent, with a suitable numerical factor, the spin angular momentum
or the magnetic moment. The matrix $\boldsymbol{\eta}$ must represent accordingly,
when multiplied by $\mu$, the electric moment of the electron. An important
distinction between the matrices $\boldsymbol{\xi}$ and $\boldsymbol{\eta}$ consists in the fact that the
former is Hermitian and therefore represents a real quantity (with the
characteristic values $\pm 1$), while the latter is anti-Hermitian and therefore
represents an imaginary quantity (with the characteristic values
$\pm i$).

This result seems at first sight to contradict the conclusion arrived
at in the preceding section, namely, that a moving electron possesses
a real electric moment represented approximately (in the corrected
Pauli theory) by the matrix $\mathbf{u}\times\boldsymbol{\sigma}/(2m_0c)$. As a matter of fact such
a contradiction does not exist, for the matrices $\mu\boldsymbol{\xi}$ and $\mu\boldsymbol{\eta}$ represent
the ‘rest-values’ of the magnetic and electric moments, i.e. their values
in a system of coordinates with respect to which the electron is at rest.
In a coordinate system with respect to which it is in motion, the
electron has an additional imaginary magnetic moment and an additional
real electric moment, these additional moments being numerically
equal and to a first approximation proportional to the velocity.

From the point of view of the classical theory, if $\boldsymbol{\mu}$ and $\boldsymbol{\nu}$ are the
rest-values of the magnetic and electric moments of a particle, then in
a coordinate system with respect to which this particle is moving with a
velocity v it will have an additional magnetic moment $\Delta\boldsymbol{\mu}$ equal, to
v 

a first approximation, to -x v and an additional electric moment Av 


equal (to the same approximation) to - x p. Putting v — ip we get 

Ap fAv. The numerical equality of the two moments is thus main¬ 
tained for a moving electron (it can easily be shown to hold exactly), 
the imaginary electric moment giving rise to an imaginary magnetic 
one and the real magnetic moment to a real electric one. This real 
electric moment is represented wave-mechanically by the operator 
pu X0 /(fw o c). 

We can now turn to the discussion of the physical meaning of Dirac’s
first-order equation $\Gamma\psi = 0$. We shall note first of all that it can be
written in the standard form

$$ (\epsilon + p_t)\psi = 0, \qquad (278) $$

where $p_t$ denotes the operator $\dfrac{h}{2\pi i}\dfrac{\partial}{\partial t}$ multiplied by the four-dimensional
matrix $\delta$, and $\epsilon$ the first-order energy operator defined as the four-dimensional
matrix

$$ \epsilon = U + c(\gamma_xu_x + \gamma_yu_y + \gamma_zu_z) + m_0c^2\gamma_0 = U + c\,\boldsymbol{\gamma}\cdot\mathbf{u} + m_0c^2\gamma_0. \qquad (278\,\mathrm{a}) $$
The important point about Dirac’s equation, namely its relativistic
symmetry with regard to time and space, is revealed by the possibility
of writing it in one of the three other equivalent forms:

$$ (P_x - p_x)\psi = 0, \qquad (P_y - p_y)\psi = 0, \qquad (P_z - p_z)\psi = 0, $$

corresponding to the election of one of the space coordinates to the
presidential role played in the usual form of the theory by the time.
Replacing the latter by the coordinate $x$, for example, we get for the
corresponding ‘momentum operator matrix’, with the help of the equation
$A\psi = 0$, the following expression:

$$ P_x = G_x - \alpha_yu_y - \alpha_zu_z - \alpha_tu_t - \alpha_0m_0c, $$

where $G_x$, the $x$-component of the ‘potential momentum’ $e\mathbf{A}/c$, is supposed
to be multiplied by the unit matrix $\delta$. The same refers, of course,
to the operator $p_x = \dfrac{h}{2\pi i}\dfrac{\partial}{\partial x}$ in the equation $(P_x - p_x)\psi = 0$ (as well as
to the operators $p_y$ and $p_z$ in the two other momentum equations).

If the operator $\epsilon$ does not contain the time explicitly, then equation
(278) admits particular solutions $\psi_{\epsilon'} = \psi^0_{\epsilon'}\,e^{-i2\pi\epsilon't/h}$ for which it
reduces to the form $(\epsilon - \epsilon')\psi_{\epsilon'} = 0$. These solutions represent different
stationary states of the electron moving in a constant electromagnetic
field.

It can easily be shown in exactly the same way as in Pauli’s theory
that functions $\psi_{\epsilon'}$ and $\psi_{\epsilon''}$, belonging to different energy values
which form a discrete spectrum, satisfy the orthogonality relation

$$ \int \psi^\dagger_{\epsilon'}\psi_{\epsilon''}\,dV = 0, $$

where $\psi^\dagger_{\epsilon'}\psi_{\epsilon''} = \sum_{\alpha=1}^{4}\psi^*_{\epsilon'\alpha}\psi_{\epsilon''\alpha}$. This enables us to build up a matrix
representation of physical quantities and a transformation theory which
differs from that based on Pauli’s equation by the fact that the additional
‘spin-index’ $\alpha$ assumes four values instead of two. Another
important difference consists in the fact that Dirac’s equation
$(\epsilon - \epsilon')\psi_{\epsilon'} = 0$ admits solutions corresponding to negative values of the
energy $\epsilon'$. This circumstance will be discussed in more detail later on
(§ 34).

It may seem at first sight that the wave-mechanical expression
(278 a), because it is linear in the operators $u_x, u_y, u_z$, representing the
components of the electron’s proper momentum, has no parallel in the
classical relativistic mechanics. A similar expression is obtained, however,
on the Einstein theory if the proper energy $mc^2 = m_0c^2/\sqrt{1 - v^2/c^2}$
is rewritten in the form

$$ mc^2 = \mathbf{g}\cdot\mathbf{v} + m_0c^2\sqrt{1 - v^2/c^2}, $$

where $\mathbf{g} = m\mathbf{v}$ is the proper momentum. Putting $\epsilon = mc^2 + U$ and
$\mathbf{g}\cdot\mathbf{v} = v_xg_x + v_yg_y + v_zg_z$, we get

$$ \epsilon = U + v_xg_x + v_yg_y + v_zg_z + m_0c^2\sqrt{1 - v^2/c^2}, \qquad (279) $$

which becomes identical with the expression (278 a) if we replace the
proper momentum vector g by the operator u, the velocity vector v by
the vector-matrix $c\boldsymbol{\gamma}$, and the expression $\sqrt{1 - v^2/c^2}$ by the matrix $\gamma_0$.
We shall write this symbolically in the form of ordinary equations:

$$ \mathbf{g} = \mathbf{u}, \qquad \mathbf{v} = c\boldsymbol{\gamma}, \qquad \sqrt{1 - v^2/c^2} = \gamma_0. \qquad (279\,\mathrm{a}) $$


The startling point about these relations is the fact that the classical momentum and velocity are replaced by operators of an entirely different type. This may be due partially to the variation of the mass (which is the proportionality coefficient between momentum and velocity) as a function of the latter. If, however, this were the only reason for the difference, we should expect the relation

$$\gamma_0\mathbf{u} = m_0c\gamma$$

to hold, which of course is not the case (see below).
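The classical rewriting of the proper energy that leads to (279) can be checked numerically. The following is only a minimal sketch: the function names and the sample velocity are hypothetical, and units with $c = 1$ are assumed.

```python
import math

def proper_energy(m0, v, c=1.0):
    """Relativistic proper energy m c^2 = m0 c^2 / sqrt(1 - v^2/c^2)."""
    v2 = sum(x * x for x in v)
    return m0 * c * c / math.sqrt(1.0 - v2 / (c * c))

def linear_form(m0, v, c=1.0):
    """The rewritten form g.v + m0 c^2 sqrt(1 - v^2/c^2)."""
    v2 = sum(x * x for x in v)
    root = math.sqrt(1.0 - v2 / (c * c))
    m = m0 / root                       # actual (velocity-dependent) mass
    g = [m * x for x in v]              # proper momentum g = m v
    g_dot_v = sum(gi * vi for gi, vi in zip(g, v))
    return g_dot_v + m0 * c * c * root

# Hypothetical sample values, chosen only for illustration (|v| < c).
sample_v = (0.3, -0.4, 0.5)
lhs = proper_energy(1.0, sample_v)
rhs = linear_form(1.0, sample_v)
```

The two numbers agree to rounding error, which is all that the classical side of the symbolic correspondence (279 a) requires.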

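The fact that the operators $\mathbf{u}$ and $c\gamma$ are the wave-mechanical representatives of the momentum and velocity vectors respectively can be established in a more direct and convincing way than has been done above. Let us consider the classical equation of motion of the relativity theory in the Lorentz-Einstein form

$$\frac{d\mathbf{g}}{dt} = e\Bigl(\mathbf{E} + \frac{1}{c}\,\mathbf{v}\times\mathbf{H}\Bigr).$$  (280)

Replacing the classical time derivative of $\mathbf{g}$ by the wave-mechanical expression

$$\frac{d}{dt}\mathbf{u} = \frac{\partial}{\partial t}\mathbf{u} + [\varepsilon, \mathbf{u}],$$  (280 a)

we get, since $\mathbf{u} = \mathbf{p} - \mathbf{G} = \mathbf{p} - e\mathbf{A}/c$,

$$\frac{\partial}{\partial t}\mathbf{u} = -\frac{e}{c}\frac{\partial}{\partial t}\mathbf{A},$$

and further, with the help of the expression (278 a) for the energy operator,

$$[\varepsilon, \mathbf{u}] = c[(\gamma\cdot\mathbf{u}), \mathbf{u}] + [U, \mathbf{u}]$$

(since $\gamma_0$ commutes with $\mathbf{u}$). Now

$$[U, u_x] = [U, p_x] = -\frac{\partial U}{\partial x} = -e\frac{\partial\phi}{\partial x},$$

$$[(\gamma\cdot\mathbf{u}), u_x] = \gamma_x[u_x, u_x] + \gamma_y[u_y, u_x] + \gamma_z[u_z, u_x]$$

$$= \frac{2\pi i}{h}\bigl[\gamma_y(u_yu_x - u_xu_y) + \gamma_z(u_zu_x - u_xu_z)\bigr]$$

$$= \frac{e}{c}(\gamma_yH_z - \gamma_zH_y) = \frac{e}{c}(\gamma\times\mathbf{H})_x$$

according to (218). We thus have

$$[\varepsilon, \mathbf{u}] = e[-\nabla\phi + \gamma\times\mathbf{H}],$$

and consequently

$$\frac{d}{dt}\mathbf{u} = e\Bigl(-\frac{1}{c}\frac{\partial\mathbf{A}}{\partial t} - \nabla\phi + \gamma\times\mathbf{H}\Bigr),$$

or finally

$$\frac{d}{dt}\mathbf{u} = e(\mathbf{E} + \gamma\times\mathbf{H}).$$  (280 b)

This equation is of exactly the same form as the classical equation (280) with $\mathbf{g}$ replaced by $\mathbf{u}$ and $\mathbf{v}/c$ by $\gamma$, in agreement with (279 a).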

Another still more direct and conclusive proof that the operator $c\gamma$ is the wave-mechanical equivalent for the velocity is obtained by calculating the operators $dx/dt$, $dy/dt$, $dz/dt$, which obviously represent the components of the vector $\mathbf{v}$. We thus get

$$\frac{dx}{dt} = [\varepsilon, x] = c[\gamma_xu_x, x]$$

(since all the other elementary operators constituting $\varepsilon$ commute with $x$), that is,

$$\frac{dx}{dt} = c\gamma_x[u_x, x] = c\gamma_x[p_x, x] = c\gamma_x,$$

or

$$\frac{d\mathbf{r}}{dt} = c\gamma,$$  (280 c)

which is the desired relation.

The physical meaning of the operator $c\gamma$ as the representative of the velocity can be finally recognized from the fact that, with the expression

$$\rho = \psi^{\dagger}\psi$$  (281)

for the density of probability, following from (232), the expressions (232 a) for the probability current-density can be written in the form

$$\mathbf{j} = c\,\psi^{\dagger}(\gamma\psi) = c\,(\psi^{\dagger}\gamma^{\dagger})\psi,$$  (281 a)

corresponding to the classical relation $\mathbf{j} = \rho\mathbf{v}$. We have, for instance, according to (271 b), taking the $x$-component of $\mathbf{j}$,

$$j_x = c\,\psi^{\dagger}\gamma_x\psi = c\,[\psi_1^{*}(\gamma_x\psi)_1 + \psi_2^{*}(\gamma_x\psi)_2 + \psi_3^{*}(\gamma_x\psi)_3 + \psi_4^{*}(\gamma_x\psi)_4],$$

which coincides with the expression (232 a) for $j_x$. Since all the three matrices $\gamma_x$, $\gamma_y$, $\gamma_z$ are Hermitian, we have $\gamma^{\dagger} = \gamma$, so that the two forms (281 a) for $\mathbf{j}$ (with $\gamma$ acting on $\psi$ and $\gamma^{\dagger}$ on $\psi^{*}$) are equivalent, being actually obtained from each other by the associative law of multiplication.

The expressions (281) and (281 a) can be derived directly from Dirac’s equation $T\psi = 0$, and this in a much simpler way than without the use of the matrix notation. Multiplying this equation (on the left) by $\psi^{\dagger}$ and subtracting from it the product of the adjoint equation $\psi^{\dagger}T^{\dagger} = 0$ by $\psi$ (on the right), we get

$$\psi^{\dagger}(T\psi) - (\psi^{\dagger}T^{\dagger})\psi = 0,$$

that is, since $\gamma_0^{\dagger} = \gamma_0$ and $\gamma^{\dagger} = \gamma$,

lp^(u ( lp) — (u*lp^)ip-\-ip^\l'>(ip —(u*</r t )y0 == 0, 

or finally

$$\frac{\partial}{\partial t}(\psi^{\dagger}\psi) + \operatorname{div}(c\,\psi^{\dagger}\gamma\psi) = 0.$$  (281 b)

This is the equation of continuity for the probability density and current density as defined by (281) and (281 a).

The expression (281 a) for the probability current-density can be 

transformed (according to Gordon) in the following way. Replacing ip by 

the expression — (p ■u-\~p t u l )ip]m 0 e, with the help of equation (272 a) we 

have j 

j = c</i'y </. =-i/r t [Y(P-u)-(-Yft «/]</', 

m 0 

or, since y = ftp, 

w 0 j = — </' + AP(P'U)i/« — 

We have further, according to (276), 

ftrP- u = PxPx u x+PxPy^ V +PxPz u z = -U x +i(t y U z -Z z U v ), 

that is, P(P u) = —u—iux? 

and pft = —ir\. We thus get 


j = + — fttfttuxS+im,)*]. (282) 

rn 0 m 0 

Transforming in a similar way the factor ip^ (instead of ip) in the expres¬ 
sion j = ^ t y t «/f and adding the result to the previous one, we get finally, 
remembering that 

fit = fit = y 0 , 5* — = tj = ty, 

j = iR(^Vo^)+cnrl(^#VoW)+|(^-^yo^)- 

(282 a) 


T t 
This expression multiplied by $e/c$ ($e$ = charge of the electron) gives the density of the electric current (in e.m. units). The latter can accordingly be written in the form

= /^W’VoW)—W ,t V , )y 0 <£]— +curl9K+i^V, 

(282 b) 

where m = iM'p'y 0 %'l< \ (283) 

and = /^4Vo T )‘/' I 

The vector $\mathfrak{M}$ must obviously be interpreted as the ‘magnetization’, i.e. the probable value per unit volume of the magnetic moment due to the electron’s spin. Its components are expressed by the formulae

= /*[#? ^2 + ^?^)“ (^04 + ^ 3 )] \ 

m y = • (283a) 

W z = —08 03 + ^4 ^ 4 )] j 

If in these expressions we neglect the products of $\psi_3$ with $\psi_4$ (which are small quantities of the second order in $1/c$) they reduce to the expressions (252 b) of Pauli’s theory. Splitting up the matrix $\psi$ into
two two-dimensional matrices \p, we can rewrite (283 a) in the form 

= fJLW<*p-x'°X)‘ (283 b) 

The vector $\mathfrak{P}$ represents the ‘electric polarization’, i.e. the probable value per unit volume of the electric moment due to the electron’s spin. In spite of its imaginary appearance it is easily seen to be a real quantity. We have, namely,

y v = I (283 c) 

% = i-Min j 

which can also be written in the form 

xW) (283 d) 

corresponding to (283 b). If $\chi$ is replaced here by its approximate expression in terms of $\psi$ according to (260) we get, with the help of (260 a),

m 0 c 

in agreement with our previous interpretation of the operator 

v = J!L uxo 
m 0 c 

as the electron’s real electric moment [cf. (261 b)].

It is interesting to note that the magnetic moment is in Dirac’s theory specified by the matrix $\gamma_0\xi$ and not by the matrix $\xi$, which was assumed to specify the mechanical angular momentum due to spin. This difference can be interpreted as the expression of the fact that in the classical theory the ratio of the magnetic moment to the angular momentum is equal to $e/(2cm)$ for orbital motion or $e/(cm)$ for the spin motion, where $m$ is not the rest-mass, but the actual mass $m_0/\sqrt{1 - v^2/c^2}$. If, therefore, in wave mechanics the spin angular momentum is represented by the matrix $h\xi/4\pi$, then the magnetic moment must be represented by the matrix $eh\gamma_0\xi/(4\pi m_0c) = \mu\gamma_0\xi$, since the classical quantity $\sqrt{1 - v^2/c^2}$ is represented by the matrix $\gamma_0$ [cf. (279 a)].

32. General Treatment of the Spin Effect; Angular Momentum and Magnetic Moment

The fact that the spin angular momentum must be represented by the vector $\mathbf{s} = (h/4\pi)\,\xi$ can be proved in the same way as in the case of the Pauli theory (where $\xi$ is replaced by $\sigma$).

We have, to begin with, according to (275 b), the following relations:

$$\xi_x\xi_y = -\xi_y\xi_x = i\xi_z, \qquad \xi_y\xi_z = -\xi_z\xi_y = i\xi_x, \qquad \xi_z\xi_x = -\xi_x\xi_z = i\xi_y,$$  (284)

and consequently

$$\xi\times\xi = 2i\xi,$$  (284 a)

so that the matrix $\xi$ satisfies the same relations as Pauli’s matrix $\sigma$, giving for the angular momentum $\mathbf{s} = \kappa\xi$ ($\kappa = h/4\pi$) the usual commutation relation $\mathbf{s}\times\mathbf{s} = -h\mathbf{s}/(2\pi i)$.

It should be mentioned that the characteristic values of the matrices $\xi_x$, $\xi_y$, $\xi_z$ are equal to $\pm 1$ (each value occurring twice), while those of $\xi_x^2$, $\xi_y^2$, $\xi_z^2$ are equal to 1. The characteristic value of $s^2$ thus turns out to be equal to $\tfrac34(h/2\pi)^2$, as before.
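The relations (284), (284 a) and the value of $s^2$ can be verified by direct matrix multiplication. The sketch below assumes the common textbook form of the two-rowed spin matrices, repeated along the diagonal of a four-rowed matrix; the sign convention of (275 b) may differ from this choice, which does not affect the algebraic relations being checked. All helper names are hypothetical.

```python
# Two-rowed spin matrices in the common textbook form (an assumption; the
# sign convention used in the text for sigma_y and sigma_z may differ).
SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]
SZ = [[1, 0], [0, -1]]

def four(s):
    """Four-rowed matrix with the 2x2 matrix s repeated on the diagonal."""
    return [[s[0][0], s[0][1], 0, 0],
            [s[1][0], s[1][1], 0, 0],
            [0, 0, s[0][0], s[0][1]],
            [0, 0, s[1][0], s[1][1]]]

def mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def sub(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

I4 = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
XI = [four(SX), four(SY), four(SZ)]

# (xi x xi)_i = xi_j xi_k - xi_k xi_j for cyclic (i, j, k); should equal 2i xi_i.
xi_cross_xi = [sub(mul(XI[(i + 1) % 3], XI[(i + 2) % 3]),
                   mul(XI[(i + 2) % 3], XI[(i + 1) % 3])) for i in range(3)]
relation_284a = all(xi_cross_xi[i] == scale(2j, XI[i]) for i in range(3))

# Each component squares to unity, so its characteristic values are +1 or -1.
squares_are_unity = all(mul(x, x) == I4 for x in XI)

# s = (1/2) xi in units of h/2pi, hence s^2 = (1/4)(xi_x^2 + xi_y^2 + xi_z^2).
xi_sq = add(add(mul(XI[0], XI[0]), mul(XI[1], XI[1])), mul(XI[2], XI[2]))
s_squared = scale(0.25, xi_sq)
```

In units of $(h/2\pi)^2$ the matrix `s_squared` comes out as $\tfrac34$ times the unit matrix, in agreement with the value quoted above.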

It can easily be verified that the matrix $\gamma_0$ commutes with $\xi$. Since, further, its square is equal to 1, the preceding relations will hold for the matrix $\gamma_0\xi$ just as well as for $\xi$. The necessity of interpreting the latter and not the former as the spin angular momentum can be inferred in an unambiguous way from the fact that the sum of $\mathbf{s} = h\xi/4\pi$ and of the orbital angular momentum

$$\mathbf{L} = \mathbf{r}\times\mathbf{u},$$  (285)

that is, the vector $\mathbf{M} = \mathbf{L} + \mathbf{s}$, satisfies the equation of motion

$$\frac{d}{dt}\mathbf{M} = \mathbf{r}\times\mathbf{F},$$

where $\mathbf{F} = e(\mathbf{E} + \gamma\times\mathbf{H})$ is the force acting on the electron [cf. (280 b)], and can accordingly be defined as the total angular momentum, while the vector $\mathbf{L} + \gamma_0\mathbf{s}$ does not satisfy this equation.

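We have in fact

$$\frac{d}{dt}\mathbf{L} = \frac{d\mathbf{r}}{dt}\times\mathbf{u} + \mathbf{r}\times\frac{d\mathbf{u}}{dt},$$

that is,

$$\frac{d}{dt}\mathbf{L} = c\gamma\times\mathbf{u} + \mathbf{r}\times\mathbf{F},$$  (285 a)

according to (280 b) and (280 c).

Replacing $\mathbf{L}$ by $\mathbf{s}$, we get, on the other hand,

$$\frac{d}{dt}\mathbf{s} = \kappa c[\gamma\cdot\mathbf{u} + \gamma_0m_0c,\ \xi],$$

or, putting $\gamma = \rho\xi$, since $\xi$ commutes both with $\rho$ and with $\gamma_0$,

$$\frac{d}{dt}\mathbf{s} = \kappa c\rho[(\xi\cdot\mathbf{u}), \xi].$$

Taking the $z$-component of this vector, we have

$$\frac{d}{dt}s_z = \kappa c\rho[(\xi_xu_x + \xi_yu_y), \xi_z] = c\rho(u_x\xi_y - u_y\xi_x) = c\rho(\mathbf{u}\times\xi)_z = c(\mathbf{u}\times\gamma)_z$$

according to (284), that is,

$$\frac{d}{dt}\mathbf{s} = -c\gamma\times\mathbf{u}.$$  (285 b)

Adding (285 a) and (285 b), we get the equation

$$\frac{d}{dt}(\mathbf{L} + \mathbf{s}) = \frac{d}{dt}\mathbf{M} = \mathbf{r}\times\mathbf{F},$$  (285 c)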


which coincides with the classical equation for the total angular momentum. In the case of a spherically symmetrical electric field and in the absence of a magnetic field the product $\mathbf{r}\times\mathbf{F}$ vanishes, so that the vector $\mathbf{M}$ is a constant of the motion.
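The spin equation of motion $d\mathbf{s}/dt = -c\gamma\times\mathbf{u}$ can be checked with explicit matrices when the components of $\mathbf{u}$ are treated as ordinary numbers (which is legitimate here, since only the matrix part of the bracket matters). This is a sketch under stated assumptions: a common textbook representation of the four-rowed matrices is used ($\xi$ block-diagonal, $\rho$ exchanging the two pairs of components, $\gamma = \rho\xi$), units with $h/2\pi = c = 1$, and arbitrary illustrative numbers for $\mathbf{u}$ and $m_0$.

```python
I2 = [[1, 0], [0, 1]]
SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]
SZ = [[1, 0], [0, -1]]
R1 = [[0, 1], [1, 0]]      # exchanges the two pairs of components
R3 = [[1, 0], [0, -1]]     # +1 on the first pair, -1 on the second

def kron(a, b):
    """Direct product of two 2x2 matrices, giving a 4x4 matrix."""
    return [[a[i][j] * b[k][l] for j in range(2) for l in range(2)]
            for i in range(2) for k in range(2)]

def mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def sub(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

XI = [kron(I2, s) for s in (SX, SY, SZ)]     # spin matrices xi
RHO = kron(R1, I2)                           # matrix rho
G0 = kron(R3, I2)                            # matrix gamma_0
GAMMA = [mul(RHO, x) for x in XI]            # velocity matrices gamma = rho xi

def bracket(a, b):
    """Wave-mechanical bracket (2 pi i / h)(ab - ba), in units h/2pi = 1."""
    return scale(1j, sub(mul(a, b), mul(b, a)))

u = (2, -3, 5)   # hypothetical numerical momentum components
m0 = 7           # hypothetical rest mass (c = 1)

# Matrix part of the energy operator: gamma.u + gamma_0 m0 (U drops out).
energy = scale(m0, G0)
for i in range(3):
    energy = add(energy, scale(u[i], GAMMA[i]))

# ds/dt = [energy, s] with s = xi / 2.
ds_dt = [bracket(energy, scale(0.5, XI[i])) for i in range(3)]

# (gamma x u)_i = gamma_j u_k - gamma_k u_j for cyclic (i, j, k).
gamma_cross_u = [sub(scale(u[(i + 2) % 3], GAMMA[(i + 1) % 3]),
                     scale(u[(i + 1) % 3], GAMMA[(i + 2) % 3])) for i in range(3)]

relation_285b = all(ds_dt[i] == scale(-1, gamma_cross_u[i]) for i in range(3))
```

The term $\gamma_0m_0$ is kept in `energy` on purpose: it commutes with $\xi$ and so makes no contribution, as stated in the text.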

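Taking the square of $\mathbf{M}$, we get the expression

$$M^2 = L^2 + 2\mathbf{L}\cdot\mathbf{s} + s^2,$$  (286)

which is also a constant of the motion. Now since

$$s^2 = \tfrac34\Bigl(\frac{h}{2\pi}\Bigr)^2$$

is itself a constant, we get

$$\frac{d}{dt}(L^2 + 2\mathbf{L}\cdot\mathbf{s}) = 0.$$  (286 a)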


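The two terms in the brackets taken separately are not constant; as has been shown, however, by Dirac, we obtain a new constant of the motion, characteristic of the relation between $\mathbf{L}$ and $\mathbf{s}$, if we consider the operator $\gamma_0\,\mathbf{L}\cdot\mathbf{s} = \mathbf{L}\cdot\mathbf{s}\gamma_0$. Taking the time derivative of this operator, we get

$$\frac{d}{dt}(\mathbf{L}\cdot\mathbf{s}\gamma_0) = \frac{d\mathbf{L}}{dt}\cdot\mathbf{s}\gamma_0 + \mathbf{L}\cdot\frac{d}{dt}(\mathbf{s}\gamma_0).$$

Replacing $d\mathbf{L}/dt$ by the expression $c\gamma\times\mathbf{p}$, according to (285 a) (using $\mathbf{u} = \mathbf{p}$ and $\mathbf{r}\times\mathbf{F} = 0$), we get

$$\frac{d\mathbf{L}}{dt}\cdot\mathbf{s} = c(\gamma\times\mathbf{p})\cdot\mathbf{s} = c\kappa\rho(\xi\times\mathbf{p})\cdot\xi = -c\kappa\rho(\xi\times\xi)\cdot\mathbf{p} = -2ic\kappa\rho\,\xi\cdot\mathbf{p} = -2ic\kappa(\gamma\cdot\mathbf{p}),$$

and consequently

$$\frac{d\mathbf{L}}{dt}\cdot\mathbf{s}\gamma_0 = -2ic\kappa(\gamma\cdot\mathbf{p})\gamma_0 = -c\kappa\frac{h}{2\pi}[(\gamma\cdot\mathbf{p}), \gamma_0],$$

since $\gamma$ anticommutes with $\gamma_0$, or

$$\frac{d\mathbf{L}}{dt}\cdot\mathbf{s}\gamma_0 = -\kappa\frac{h}{2\pi}[\varepsilon, \gamma_0] = -\frac{h^2}{8\pi^2}\frac{d\gamma_0}{dt}.$$

We have further

$$\frac{d}{dt}(\mathbf{s}\gamma_0) = -c(\gamma\times\mathbf{p})\gamma_0 + \frac{4\pi i}{h}\,\mathbf{s}\,c(\mathbf{p}\cdot\gamma)\gamma_0 = [-(\gamma\times\mathbf{p}) + i\xi(\mathbf{p}\cdot\gamma)]\,c\gamma_0.$$

Now

$$\xi(\mathbf{p}\cdot\gamma) = \rho\xi(\mathbf{p}\cdot\xi) = \rho(\mathbf{p} + i\mathbf{p}\times\xi) = \rho\mathbf{p} + i\mathbf{p}\times\gamma,$$

so that

$$\frac{d}{dt}(\mathbf{s}\gamma_0) = i\rho\mathbf{p}\,c\gamma_0, \qquad \mathbf{L}\cdot\frac{d}{dt}(\mathbf{s}\gamma_0) = 0,$$

since $\mathbf{L}\cdot\mathbf{p} = (\mathbf{r}\times\mathbf{p})\cdot\mathbf{p} = 0$. We thus get

$$\frac{d}{dt}(\mathbf{L}\cdot\mathbf{s}\gamma_0) = -\frac{h^2}{8\pi^2}\frac{d}{dt}\gamma_0,$$

that is,

$$\Bigl(\mathbf{L}\cdot\mathbf{s} + \frac{h^2}{8\pi^2}\Bigr)\gamma_0 = \text{const.} = \frac{h^2}{8\pi^2}\,k,$$  (287)

where $k$ is an ordinary number, replacing the angular quantum number of the old theory; the fact that it can assume integral values only will be shown later on.

Taking into account the identity

$$(\mathbf{L}\cdot\xi)^2 = L^2 + i(\mathbf{L}\times\mathbf{L})\cdot\xi = L^2 - \frac{h}{2\pi}\mathbf{L}\cdot\xi = L^2 - 2\mathbf{L}\cdot\mathbf{s}$$

and rewriting (287) in the form

$$\mathbf{L}\cdot\xi = \frac{h}{2\pi}(\gamma_0k - 1) \qquad (\gamma_0^2 = 1),$$

we get further

$$L^2 - 2\mathbf{L}\cdot\mathbf{s} = \Bigl(\frac{h}{2\pi}\Bigr)^2(\gamma_0k - 1)^2,$$

that is,

$$L^2 = \Bigl(\frac{h}{2\pi}\Bigr)^2\gamma_0k(k\gamma_0 - 1) = \Bigl(\frac{h}{2\pi}\Bigr)^2k(k - \gamma_0),$$  (287 a)

and

$$L^2 + 2\mathbf{L}\cdot\mathbf{s} = \Bigl(\frac{h}{2\pi}\Bigr)^2(k^2 - 1) = \text{const.},$$  (287 b)

in agreement with (286 a). Adding to both sides of this equation the term $s^2 = \tfrac34(h/2\pi)^2$, we obtain finally

$$M^2 = \Bigl(\frac{h}{2\pi}\Bigr)^2(k^2 - \tfrac14).$$  (287 c)

The latter expression is usually written in the form

$$M^2 = \Bigl(\frac{h}{2\pi}\Bigr)^2 j(j + 1),$$

where $j = |k| - \tfrac12$ is the so-called ‘inner’ or ‘total’ angular quantum number.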
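The bookkeeping between $k$, $j$ and the characteristic values in (287 a) and (287 c) is easy to tabulate with exact fractions. The following is an illustrative sketch; the function names are hypothetical, and all quantities are expressed in units of $(h/2\pi)^2$.

```python
from fractions import Fraction

def inner_quantum_number(k):
    """j = |k| - 1/2; the value k = 0 is excluded."""
    if k == 0:
        raise ValueError("k = 0 is excluded")
    return Fraction(abs(k)) - Fraction(1, 2)

def m_squared(k):
    """M^2 in units of (h/2pi)^2 according to (287 c): k^2 - 1/4."""
    return Fraction(k * k) - Fraction(1, 4)

def l_squared(k, gamma0):
    """L^2 in units of (h/2pi)^2 according to (287 a), for gamma0 = +1 or -1."""
    return k * (k - gamma0)

# One row per k: (k, j, M^2, L^2 for gamma0 = +1, L^2 for gamma0 = -1).
table = [(k, inner_quantum_number(k), m_squared(k),
          l_squared(k, +1), l_squared(k, -1))
         for k in (-3, -2, -1, 1, 2, 3)]
```

Every row satisfies $k^2 - \tfrac14 = j(j + 1)$, and the two values of $L^2$ in a row differ, which is the numerical expression of the fact that $L^2$ does not commute with $\gamma_0$-dependent parts of the energy and is not a constant of the motion.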

An angular quantum number of the same character as that which in the Schrödinger theory specifies $L$ according to the formula $L^2 = (h/2\pi)^2\,l(l + 1)$ does not exist in Dirac’s theory, since $L^2$ is not a constant of the motion, as shown by the formula (287 a). It should be noticed that the number $k$ can assume both positive and negative values (which can be interpreted as corresponding respectively to the same or to the opposite orientation of the orbit and spin axis), the value $k = 0$ being obviously excluded [as seen from (287 c)].

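The preceding results, which are strictly valid for the motion in a spherically symmetrical electric field, remain approximately valid in the presence of a weak homogeneous magnetic field. Such a field $\mathfrak{H}$, which can be derived from the vector potential $\mathbf{A} = \tfrac12\mathfrak{H}\times\mathbf{r}$, corresponds to the additional term $\varepsilon_m = -(e/c)\mathbf{A}\cdot c\gamma = -\tfrac12e(\mathfrak{H}\times\mathbf{r})\cdot\gamma$, that is

$$\varepsilon_m = -\tfrac12e\,\mathfrak{H}\cdot(\mathbf{r}\times\gamma)$$  (288)

in the energy operator $\varepsilon$. This additional term can be identified with the ordinary expression for the magnetic energy if the vector

$$\boldsymbol{\mu} = \frac{e}{2c}\,\mathbf{r}\times c\gamma = \tfrac12e\,\mathbf{r}\times\gamma$$  (288 a)

is defined as the total magnetic moment of the electron.

We have in this case, according to (285 c) with $\mathbf{F} = e\gamma\times\mathfrak{H}$,

$$\frac{d}{dt}\mathbf{M} = e\,\mathbf{r}\times(\gamma\times\mathfrak{H}).$$

With the help of the equation

$$\frac{d}{dt}[\mathbf{r}\times(\mathbf{r}\times\mathfrak{H})] = \frac{d\mathbf{r}}{dt}\times(\mathbf{r}\times\mathfrak{H}) + \mathbf{r}\times\Bigl(\frac{d\mathbf{r}}{dt}\times\mathfrak{H}\Bigr)$$

we get, neglecting the left-hand term (since its time-average value vanishes),

$$\gamma\times(\mathbf{r}\times\mathfrak{H}) + \mathbf{r}\times(\gamma\times\mathfrak{H}) = 0,$$

whence, using the vector identity

$$\mathbf{r}\times(\gamma\times\mathfrak{H}) + \gamma\times(\mathfrak{H}\times\mathbf{r}) + \mathfrak{H}\times(\mathbf{r}\times\gamma) = 0,$$

$$\frac{d}{dt}\mathbf{M} = \tfrac12e(\mathbf{r}\times\gamma)\times\mathfrak{H} = \boldsymbol{\mu}\times\mathfrak{H},$$  (288 b)

in agreement with the classical theory.

Taking the scalar product of both sides with $\mathfrak{H}$, we get

$$\frac{d}{dt}(\mathbf{M}\cdot\mathfrak{H}) = 0,$$  (288 c)

which means that the projection of the angular momentum in the direction of the magnetic field remains constant.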
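The vector algebra behind the step from $e\,\mathbf{r}\times(\gamma\times\mathfrak{H})$ to $\boldsymbol{\mu}\times\mathfrak{H}$ can be checked with ordinary numerical vectors. The sketch below (hypothetical names, arbitrary integer sample vectors) verifies the cyclic identity and shows that $\mathbf{r}\times(\mathbf{g}\times\mathbf{H})$ differs from $\tfrac12(\mathbf{r}\times\mathbf{g})\times\mathbf{H}$ only by one half of the symmetric combination whose time average is dropped in the text.

```python
def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def add(a, b):
    return tuple(x + y for x, y in zip(a, b))

def scale(c, a):
    return tuple(c * x for x in a)

# Hypothetical sample vectors (integers keep the arithmetic exact);
# g plays the part of the velocity vector gamma.
r = (1, -2, 3)
g = (4, 0, -5)
H = (-2, 7, 1)

# Cyclic identity: r x (g x H) + g x (H x r) + H x (r x g) = 0.
jacobi = add(add(cross(r, cross(g, H)), cross(g, cross(H, r))),
             cross(H, cross(r, g)))

# Symmetric combination that is neglected on the time average.
symmetric = add(cross(r, cross(g, H)), cross(g, cross(r, H)))

# 2 r x (g x H) = (r x g) x H + symmetric part.
lhs_twice = scale(2, cross(r, cross(g, H)))
rhs_twice = add(cross(cross(r, g), H), symmetric)
```

When the symmetric part vanishes on the average, what remains is exactly $\mathbf{r}\times(\mathbf{g}\times\mathbf{H}) = \tfrac12(\mathbf{r}\times\mathbf{g})\times\mathbf{H}$, the form used in (288 b).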

The formula (288 a) corresponds to the classical formula $\boldsymbol{\mu} = \tfrac12e\,\mathbf{r}\times\mathbf{v}/c$ for the orbital magnetic moment due to the electron’s translational motion alone, without any spin. According to the considerations developed before in connexion with the spin magnetic moment $\mu\gamma_0\xi$ one might expect that the total magnetic moment would be expressed as the sum

$$\frac{e}{2m_0c}\gamma_0\,\mathbf{r}\times\mathbf{u} + \mu\gamma_0\xi = \frac{e}{2m_0c}\gamma_0\Bigl(\mathbf{r}\times\mathbf{u} + \frac{h}{2\pi}\xi\Bigr).$$

This expression is, however, not exactly equivalent to the expression (288 a).

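In order to transform the operator $\boldsymbol{\mu}$ to an equivalent form of the above type, we shall consider its probable or average value $\bar{\boldsymbol{\mu}} = \int\psi^{\dagger}\boldsymbol{\mu}\psi\,dV$, which can obviously be written in the form

$$\bar{\boldsymbol{\mu}} = \frac{e}{2c}\int\mathbf{r}\times\mathbf{j}\,dV,$$

where $\mathbf{j} = c\psi^{\dagger}\gamma\psi$ is the probability current density. Using the expression (282 b) for $e\mathbf{j}/c$, we get

$$\bar{\boldsymbol{\mu}} = \frac{e}{2cm_0}R\int\psi^{\dagger}\gamma_0\,\mathbf{r}\times\mathbf{u}\,\psi\,dV + \tfrac12\int\mathbf{r}\times\operatorname{curl}\mathfrak{M}\,dV + \frac{1}{2c}\int\mathbf{r}\times\frac{\partial\mathfrak{P}}{\partial t}\,dV.$$

Now the first integral is equal to the probable value of $\gamma_0\mathbf{L}$. With the help of the vector identity

$$\nabla(\mathbf{A}\cdot\mathbf{B}) = (\mathbf{A}\cdot\nabla)\mathbf{B} + (\mathbf{B}\cdot\nabla)\mathbf{A} + \mathbf{A}\times\operatorname{curl}\mathbf{B} + \mathbf{B}\times\operatorname{curl}\mathbf{A}$$

we get further

$$\mathbf{r}\times\operatorname{curl}\mathfrak{M} = \nabla(\mathbf{r}\cdot\mathfrak{M}) - (\mathfrak{M}\cdot\nabla)\mathbf{r} - (\mathbf{r}\cdot\nabla)\mathfrak{M} = \nabla(\mathfrak{M}\cdot\mathbf{r}) - \mathfrak{M} - r\frac{\partial}{\partial r}\mathfrak{M},$$

since

$$\operatorname{curl}\mathbf{r} = 0, \qquad (\mathfrak{M}\cdot\nabla)\mathbf{r} = \mathfrak{M}, \qquad (\mathbf{r}\cdot\nabla) = x\frac{\partial}{\partial x} + y\frac{\partial}{\partial y} + z\frac{\partial}{\partial z} = r\frac{\partial}{\partial r}.$$

In the latter expression $\partial/\partial r$ denotes a partial differentiation with regard to the distance from the origin of a polar coordinate system, the two angular coordinates being kept constant. Writing the volume element $dV$ in the form $r^2\,dr\,d\omega$, where $d\omega$ denotes the element of solid angle, we have

$$\int r\frac{\partial}{\partial r}\mathfrak{M}\,dV = \int d\omega\int_0^{\infty}r^3\frac{\partial\mathfrak{M}}{\partial r}\,dr = \int d\omega\int_0^{\infty}\frac{\partial}{\partial r}(r^3\mathfrak{M})\,dr - 3\int d\omega\int_0^{\infty}r^2\mathfrak{M}\,dr = -3\int\mathfrak{M}\,dV.$$

Consequently,

$$\tfrac12\int\mathbf{r}\times\operatorname{curl}\mathfrak{M}\,dV = \int\mathfrak{M}\,dV = \mu\,\overline{\gamma_0\xi} = \frac{e}{m_0c}\,\overline{\gamma_0\mathbf{s}}.$$

We thus see that so far as its probable value is concerned the operator $\boldsymbol{\mu}$ is equivalent, at least in the case of a stationary state when the expression

$$\int\mathbf{r}\times\frac{\partial\mathfrak{P}}{\partial t}\,dV$$

vanishes, to the following one:

$$\boldsymbol{\mu} = \frac{e}{2m_0c}\gamma_0(\mathbf{L} + 2\mathbf{s}).$$

This ‘effective’ magnetic moment can be replaced approximately by the expression

$$\boldsymbol{\mu} = \frac{e}{2m_0c}(\mathbf{L} + 2\mathbf{s})$$

not involving the factor $\gamma_0$, which accounts for the variation of the mass with the velocity, and whose probable value differs by quantities of the second order in $1/c$ from 1.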

The fact that the expression (288 a) does not contain explicitly the spin contribution to the magnetic moment shows very clearly that the ‘spin-motion’ has no real existence as something independent of the translational motion, but is actually a certain aspect of it. This circumstance can be regarded as a consequence of the fact that in Dirac’s theory there is no direct relation between the vector $\mathbf{u} = \mathbf{p} - e\mathbf{A}/c$ representing the proper momentum of the electron ($m\mathbf{v}$) and the vector $c\gamma$ representing its velocity. These two vectors accordingly cannot be treated as parallel to each other. In fact, the lack of parallelism, as measured by the vector product $c\gamma\times\mathbf{u}$, can be considered according to equations (285 a) and (285 b) as the cause of the change of the orbital and spin components of the angular momentum in the absence of a magnetic field.

The fact that the electron’s spin is not an independent kinematic property but merely an aspect of the translational motion (resulting from the divorce between the velocity and momentum) is indicated also by the relation (277 b) between the matrices $\gamma$ and $\xi$ representing respectively the translational and the ‘spin’ velocity. If the proportionality coefficient $\rho$ were an ordinary number, then the relation $\gamma = \rho\xi$ would imply that the two vectors represented by $\gamma$ and $\xi$ were parallel to each other. Since, however, $\rho$ is a matrix, such a parallelism does not necessarily exist, as may be seen from the calculation of the probable values of $\gamma$ and $\xi$.

It should be mentioned further that the characteristic values of the matrices $\gamma_x$, $\gamma_y$, $\gamma_z$ are the same as those of $\xi_x$, $\xi_y$, $\xi_z$, that is, $+1$ and $-1$ (each of them occurring twice). This means that the characteristic values of the components of the electron’s velocity as defined by the vector $c\gamma$ are equal either to $+c$ or $-c$. We have here the same type of duplicity as in the case of the electron’s spin. For the components of the momentum as represented by the vector $\mathbf{u}$ we get a continuous spectrum extending from $-\infty$ to $+\infty$, as in the classical theory. The same would refer to the velocity if the latter were defined not by the vector $c\gamma$ but by the vector $\gamma_0\mathbf{u}/m_0$, corresponding to the classical relation between velocity and momentum. Such a definition is, however, inconsistent with the relations $dx/dt = c\gamma_x$, etc., derived above. It has been shown by V. Fock that, in spite of this, the two definitions of the velocity become identical in the limiting case when the quantum theory reduces to the classical one (for instance, in the case of a motion with very large energy).
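The statement about the characteristic values can be illustrated with explicit matrices. For a four-rowed matrix whose square is unity the characteristic values can only be $+1$ or $-1$, and the trace then fixes how often each occurs. The sketch below assumes a common textbook representation ($\xi$ block-diagonal, $\gamma = \rho\xi$, $\gamma_0$ diagonal); the helper names are hypothetical.

```python
I2 = [[1, 0], [0, 1]]
SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]
SZ = [[1, 0], [0, -1]]
R1 = [[0, 1], [1, 0]]
R3 = [[1, 0], [0, -1]]

def kron(a, b):
    """Direct product of two 2x2 matrices, giving a 4x4 matrix."""
    return [[a[i][j] * b[k][l] for j in range(2) for l in range(2)]
            for i in range(2) for k in range(2)]

def mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

I4 = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
ZERO = [[0] * 4 for _ in range(4)]

XI = [kron(I2, s) for s in (SX, SY, SZ)]
RHO = kron(R1, I2)
G0 = kron(R3, I2)
GAMMA = [mul(RHO, x) for x in XI]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def multiplicities(a):
    """(number of +1, number of -1) for a 4x4 matrix whose square is unity."""
    if mul(a, a) != I4:
        raise ValueError("matrix does not square to unity")
    t = complex(trace(a)).real          # n_plus - n_minus
    n_plus = int(round((4 + t) / 2))    # n_plus + n_minus = 4
    return n_plus, 4 - n_plus

velocity_multiplicities = [multiplicities(g) for g in GAMMA]
spin_multiplicities = [multiplicities(x) for x in XI]

# gamma anticommutes with gamma_0, as used repeatedly in the text.
anticommutes = all(add(mul(g, G0), mul(G0, g)) == ZERO for g in GAMMA)
```

Each component of $\gamma$ thus has the values $\pm 1$ twice over, so that each component of $c\gamma$ can only take the values $\pm c$.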

The relationship between the translational and spin motion can be 
interpreted according to Bohr as a particular case of Heisenberg’s 
uncertainty relation, resulting from the consideration of the magnetic 
force experienced or produced by a moving electrified particle without 
any actual spin. 

The magnetic field produced by such a particle (electron) at a distance $r$ is given by the well-known Biot-Savart formula:

$$\mathfrak{H} = \frac{e}{c}\,\frac{\mathbf{r}\times\mathbf{v}}{r^3}.$$

Now the exact determination of $\mathfrak{H}$ according to this formula requires the simultaneous knowledge both of the position, i.e. the radius vector $\mathbf{r}$ of the electron (drawn from the point $P$ for which $\mathfrak{H}$ is to be determined), and of its velocity $\mathbf{v}$. This is, however, impossible, since both quantities can be measured at the same time with a limited accuracy only, so that the products $\Delta x\,\Delta v_x$, etc., are at least of the order of magnitude of $h/m_0$. This implies an inaccuracy

$$\Delta\mathfrak{H} \sim \frac{eh}{cm_0r^3} \sim \frac{1}{r^3}\Bigl(\frac{eh}{4\pi m_0c}\Bigr)$$

in the determination of $\mathfrak{H}$, which can be interpreted as an additional magnetic field (of unknown direction) due to a particle with a magnetic moment $\mu$. The superposition of the magnetic field produced by the electron’s spin on that due to its translational motion thus secures the validity of the uncertainty relation between position and velocity, so far as they can be determined from the electron’s magnetic action.
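To give a feeling for the orders of magnitude involved, the moment $\mu = eh/(4\pi m_0c)$ and the corresponding field uncertainty $\mu/r^3$ can be evaluated numerically. This is an illustrative sketch only: the constants are rounded modern values in Gaussian (c.g.s.) units, which are not quoted in the text, and the function names are hypothetical.

```python
import math

# Rounded modern values in Gaussian (c.g.s.) units; an assumption of this
# sketch, not figures taken from the text.
E_CHARGE = 4.80320e-10      # electron charge, e.s.u.
PLANCK_H = 6.62607e-27      # Planck's constant h, erg s
M_ELECTRON = 9.10938e-28    # electron rest-mass m0, g
C_LIGHT = 2.99792e10        # velocity of light, cm/s

def spin_moment():
    """mu = e h / (4 pi m0 c), in erg/gauss."""
    return E_CHARGE * PLANCK_H / (4.0 * math.pi * M_ELECTRON * C_LIGHT)

def field_uncertainty(r_cm):
    """Order-of-magnitude estimate mu / r^3 of the field inaccuracy, in gauss."""
    return spin_moment() / r_cm ** 3

mu = spin_moment()
delta_h_at_one_angstrom = field_uncertainty(1.0e-8)
```

At a distance of $10^{-8}$ cm the estimate is of the order of $10^4$ gauss; like the argument in the text, it is meant as an order of magnitude and nothing more.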

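A similar result is obtained if we consider the force $\mathbf{F} = e\,\mathbf{v}\times\mathfrak{H}/c$ experienced by an electron in a given external magnetic field. The inaccuracy $\Delta\mathbf{r}$ in the electron’s location leads to an inaccuracy $\Delta\mathfrak{H} = (\Delta\mathbf{r}\cdot\nabla)\mathfrak{H}$ in the estimation of the field strength $\mathfrak{H}$. Replacing $\mathbf{v}$ in the preceding formula by the corresponding inaccuracy $\Delta\mathbf{v}$, we get

$$|\Delta\mathbf{F}| = \frac{e}{c}\,|\Delta\mathbf{v}\times(\Delta\mathbf{r}\cdot\nabla)\mathfrak{H}| \cong \frac{eh}{cm_0}\nabla\mathfrak{H},$$

which agrees, with regard to the order of magnitude, with the force acting on a magnet with moment $\boldsymbol{\mu}$ in an inhomogeneous field [$(\boldsymbol{\mu}\cdot\nabla)\mathfrak{H}$].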

33. The Motion of an Electron in a Central Field of Force; Fine Structure and Zeeman Effect

We shall now turn to the more detailed discussion of the problem of the motion of an electron in a spherically symmetrical field of force according to Dirac’s theory.

The function quadruplet $\psi_1$, $\psi_2$, $\psi_3$, $\psi_4$ corresponding to a definite energy-level $\varepsilon = \varepsilon'$ can be determined in a general way from the equation $(\varepsilon - \varepsilon')\psi = 0$. In the case under consideration it is, however, more advantageous to start not with the energy but with the angular constants of the motion and specify the functions $\psi$ so as to make them the characteristic functions of the corresponding operators.

The most suitable operators for this purpose are $M_z$, the projection of the angular momentum operator on one of the coordinate axes, and the operator $M^2$; the operator $L^2$, although it is not an exact constant of the motion, can also serve for the determination of $\psi$.

Putting

$$(M_z - M_z')\psi = 0,$$  (289)

we get, from $M_z = \frac{h}{2\pi i}\frac{\partial}{\partial\phi} + \frac{h}{4\pi}\xi_z$ and the definition (275 b) of $\xi_z$, the following system of four ordinary equations:

$$\frac{1}{i}\frac{\partial\psi_1}{\partial\phi} - \tfrac12\psi_1 = c'\psi_1, \qquad \frac{1}{i}\frac{\partial\psi_2}{\partial\phi} + \tfrac12\psi_2 = c'\psi_2,$$

$$\frac{1}{i}\frac{\partial\psi_3}{\partial\phi} - \tfrac12\psi_3 = c'\psi_3, \qquad \frac{1}{i}\frac{\partial\psi_4}{\partial\phi} + \tfrac12\psi_4 = c'\psi_4,$$

where $c' = 2\pi M_z'/h$ is a constant. An immediate consequence of these equations is that the dependence of the functions $\psi_3$, $\psi_4$ on the longitude $\phi$ is the same as that of the functions $\psi_1$, $\psi_2$. This dependence is obviously given by the formulae

$$\psi_1,\ \psi_3 \sim A\,e^{i(m+1)\phi}, \qquad \psi_2,\ \psi_4 \sim B\,e^{im\phi},$$  (289 a)

where $A$ and $B$ are functions of the co-latitude $\theta$, with $c' = m + \tfrac12$, that is,

$$M_z' = \frac{h}{2\pi}(m + \tfrac12),$$  (289 b)

$m$ denoting an arbitrary integral number.

The determination of the functions $A$, $B$ can be carried out in the simplest way by applying to $\psi$ the operator $L^2$. This gives, according to the relation (287 a),

$$L^2\psi_{1,2} = \Bigl(\frac{h}{2\pi}\Bigr)^2k(k - 1)\,\psi_{1,2}, \qquad L^2\psi_{3,4} = \Bigl(\frac{h}{2\pi}\Bigr)^2k(k + 1)\,\psi_{3,4},$$  (290)

since

$$\gamma_0 = \begin{pmatrix}1&0&0&0\\0&1&0&0\\0&0&-1&0\\0&0&0&-1\end{pmatrix}.$$

Equations (290) show that the functions $\psi$, so far as their dependence on the polar angles $\theta$, $\phi$ is concerned, are spherical harmonics, just as in Schrödinger’s theory. (It will be remembered that $L^2 = -(h/2\pi)^2\Omega^2$, where $\Omega^2$ is the Laplacian operator on the sphere, and that the equation $\Omega^2\psi + l(l + 1)\psi = 0$ is satisfied by spherical harmonic functions of the order $l \geq 0$.) They show, moreover, that the function pairs $\psi_1$, $\psi_2$ and $\psi_3$, $\psi_4$ are spherical harmonics of different orders, and that the number $k$ which determines these orders can have integral values only. We must distinguish two cases, namely, $k > 0$ and $k < 0$. In the former case we get, putting $k = l + 1$, with regard to (289 a):

$$\begin{aligned}
\psi_1 &= a_1FY_{l,m+1}(\theta, \phi) = a_1FP_{l,m+1}(\theta)e^{i(m+1)\phi}\\
\psi_2 &= a_2FY_{l,m}(\theta, \phi) = a_2FP_{l,m}(\theta)e^{im\phi}\\
\psi_3 &= a_3GY_{l+1,m+1}(\theta, \phi) = a_3GP_{l+1,m+1}(\theta)e^{i(m+1)\phi}\\
\psi_4 &= a_4GY_{l+1,m}(\theta, \phi) = a_4GP_{l+1,m}(\theta)e^{im\phi}
\end{aligned}$$  (290 a)

where $F$ and $G$ are two unknown functions of the distance $r$ alone, while $a_1$, $a_2$, $a_3$, $a_4$ are certain numerical coefficients. $P_{l,m}(\theta)$ denotes the associated spherical harmonic function $\sin^{|m|}\theta\,P_l^{(|m|)}(\cos\theta)$.

In the case $k < 0$ we shall put $l = -k = |k|$, which gives

$$\begin{aligned}
\psi_1 &= b_1FY_{l,m+1} = b_1FP_{l,m+1}(\theta)e^{i(m+1)\phi}\\
\psi_2 &= b_2FY_{l,m} = b_2FP_{l,m}(\theta)e^{im\phi}\\
\psi_3 &= b_3GY_{l-1,m+1} = b_3GP_{l-1,m+1}(\theta)e^{i(m+1)\phi}\\
\psi_4 &= b_4GY_{l-1,m} = b_4GP_{l-1,m}(\theta)e^{im\phi}
\end{aligned}$$  (290 b)

where $b_1$, $b_2$, $b_3$, $b_4$ are another set of coefficients.

The number

$$l = k - 1 \quad (k > 0) \qquad\text{or}\qquad l = -k = |k| \quad (k < 0),$$

i.e. the order of the spherical harmonic functions appearing in the principal pair $\psi_1$, $\psi_2$, is called the angular quantum number of the state in question. The two states specified by the functions (290 a) and (290 b) can be distinguished by their inner quantum number $j$, which is equal to $l + \tfrac12$ in the first case and to $l - \tfrac12$ in the second (i.e. in both cases to the arithmetic mean of the orders of the spherical harmonics in $\psi_1$, $\psi_2$ and $\psi_3$, $\psi_4$). The two states belonging to the same $j$ and to different values $j \pm \tfrac12$ of $l$ are specified by functions of the type $\psi_1, \psi_2 \sim Y_{j+\frac12}$, $\psi_3, \psi_4 \sim Y_{j-\frac12}$ and $\psi_1, \psi_2 \sim Y_{j-\frac12}$, $\psi_3, \psi_4 \sim Y_{j+\frac12}$ respectively.

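The ratio between the coefficients $a_1$, $a_2$ on the one hand and $a_3$, $a_4$ on the other can be determined from the equation

$$(M^2 - M'^2)\psi = 0, \qquad M'^2 = \Bigl(\frac{h}{2\pi}\Bigr)^2j(j + 1),$$  (291)

which can serve for the complete determination of the angular factors in the quadruplet $\psi$ (inasmuch as the direction of the privileged axis remains unsettled). It is somewhat simpler, however, to combine for that purpose the equations (290) and (287). Putting $\mathbf{L} = h\boldsymbol{\Lambda}/2\pi$, we can rewrite the latter in the form

$$\boldsymbol{\Lambda}\cdot\xi = k\gamma_0 - 1,$$

which is equivalent to the system of equations

$$\begin{aligned}
(\Lambda_x + i\Lambda_y)\psi_2 - \Lambda_z\psi_1 &= (k - 1)\psi_1\\
(\Lambda_x - i\Lambda_y)\psi_1 + \Lambda_z\psi_2 &= (k - 1)\psi_2\\
(\Lambda_x + i\Lambda_y)\psi_4 - \Lambda_z\psi_3 &= -(k + 1)\psi_3\\
(\Lambda_x - i\Lambda_y)\psi_3 + \Lambda_z\psi_4 &= -(k + 1)\psi_4
\end{aligned}$$  (291 a)

We have here

$$\Lambda_x = \frac{1}{i}\Bigl(y\frac{\partial}{\partial z} - z\frac{\partial}{\partial y}\Bigr), \qquad \Lambda_y = \frac{1}{i}\Bigl(z\frac{\partial}{\partial x} - x\frac{\partial}{\partial z}\Bigr), \qquad \Lambda_z = \frac{1}{i}\Bigl(x\frac{\partial}{\partial y} - y\frac{\partial}{\partial x}\Bigr),$$  (292)

or in polar coordinates $r$, $\theta$, $\phi$,

$$\Lambda_x + i\Lambda_y = e^{i\phi}\Bigl(\frac{\partial}{\partial\theta} + i\cot\theta\frac{\partial}{\partial\phi}\Bigr), \qquad \Lambda_x - i\Lambda_y = e^{-i\phi}\Bigl(-\frac{\partial}{\partial\theta} + i\cot\theta\frac{\partial}{\partial\phi}\Bigr), \qquad \Lambda_z = \frac{1}{i}\frac{\partial}{\partial\phi}.$$

The first two expressions can be obtained as follows. We shall put for the sake of brevity $\partial/\partial x = \partial_x$, etc. We shall further introduce the complex variable $w = x + iy$ and the corresponding derivative $\partial_w = \partial_x + i\partial_y$. We get then

$$\Lambda_x + i\Lambda_y = z\partial_w - w\partial_z.$$

On the other hand, we have

$$\partial_\theta = \frac{\partial x}{\partial\theta}\partial_x + \frac{\partial y}{\partial\theta}\partial_y + \frac{\partial z}{\partial\theta}\partial_z = \cot\theta\,(x\partial_x + y\partial_y) - \tan\theta\,z\partial_z$$

and

$$x\partial_x + y\partial_y = w^*\partial_w - i\partial_\phi,$$

whence

$$\partial_w = \frac{1}{w^*}(x\partial_x + y\partial_y - \Lambda_z) = \frac{1}{w^*\cot\theta}\bigl[(\partial_\theta + \tan\theta\,z\partial_z) - \cot\theta\,\Lambda_z\bigr],$$

and consequently

$$\Lambda_x + i\Lambda_y = \frac{zw}{|w|^2\cot\theta}\bigl[(\partial_\theta + \tan\theta\,z\partial_z) + \cot\theta\,i\partial_\phi\bigr] - w\partial_z.$$

Since

$$\frac{z}{|w|}\tan\theta = 1, \qquad \frac{w}{|w|} = e^{i\phi},$$

we find finally

$$\Lambda_x + i\Lambda_y = e^{i\phi}(\partial_\theta + i\cot\theta\,\partial_\phi).$$

We thus get in the case of the functions (290 a) ($k = l + 1$):

$$\begin{aligned}
a_2\Bigl(\frac{d}{d\theta} - m\cot\theta\Bigr)P_{l,m} &= a_1(l + m + 1)P_{l,m+1}\\
a_1\Bigl(\frac{d}{d\theta} + (m + 1)\cot\theta\Bigr)P_{l,m+1} &= -a_2(l - m)P_{l,m}\\
a_4\Bigl(\frac{d}{d\theta} - m\cot\theta\Bigr)P_{l+1,m} &= -a_3(l - m + 1)P_{l+1,m+1}\\
a_3\Bigl(\frac{d}{d\theta} + (m + 1)\cot\theta\Bigr)P_{l+1,m+1} &= a_4(l + m + 2)P_{l+1,m}
\end{aligned}$$  (292 a)

These equations can be used not only for the definition of the ratios $a_1 : a_2$ and $a_3 : a_4$ but also for the determination of the ‘associated’ spherical harmonics $P_{l,m}$, etc. (supposed to be normalized in the same way for all values of $l$ and $m$). Eliminating $P_{l,m+1}$ between the first two equations (292 a), we find, for instance,

$$\Bigl(\frac{d}{d\theta} + (m + 1)\cot\theta\Bigr)\Bigl(\frac{d}{d\theta} - m\cot\theta\Bigr)P_{l,m} + (l - m)(l + m + 1)P_{l,m} = 0,$$

that is,

$$\frac{d^2P_{l,m}}{d\theta^2} + \cot\theta\frac{dP_{l,m}}{d\theta} + \Bigl[l(l + 1) - \frac{m^2}{\sin^2\theta}\Bigr]P_{l,m} = 0,$$

which is the standard equation for the functions $P_{l,m}$.

In the case (290 b) we get with $k = -l$ a similar set of equations, namely,

$$\begin{aligned}
b_2\Bigl(\frac{d}{d\theta} - m\cot\theta\Bigr)P_{l,m} &= -b_1(l - m)P_{l,m+1}\\
b_1\Bigl(\frac{d}{d\theta} + (m + 1)\cot\theta\Bigr)P_{l,m+1} &= b_2(l + m + 1)P_{l,m}\\
b_4\Bigl(\frac{d}{d\theta} - m\cot\theta\Bigr)P_{l-1,m} &= b_3(l + m)P_{l-1,m+1}\\
b_3\Bigl(\frac{d}{d\theta} + (m + 1)\cot\theta\Bigr)P_{l-1,m+1} &= -b_4(l - m - 1)P_{l-1,m}
\end{aligned}$$  (292 b)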
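The standard second-order equation for the associated functions $P_{l,m}$ mentioned above can be checked numerically for functions of the form $\sin^{|m|}\theta\,P_l^{(|m|)}(\cos\theta)$. The sketch below is only an illustration: it builds the Legendre polynomials from the usual three-term recurrence, uses central finite differences for the derivatives, and adopts no particular normalization; all names are hypothetical.

```python
import math

def legendre_coeffs(l):
    """Coefficients (lowest power first) of the Legendre polynomial P_l(x)."""
    prev, cur = [1.0], [0.0, 1.0]
    if l == 0:
        return prev
    for n in range(1, l):
        # (n + 1) P_{n+1} = (2n + 1) x P_n - n P_{n-1}
        nxt = [0.0] * (n + 2)
        for i, c in enumerate(cur):
            nxt[i + 1] += (2 * n + 1) * c / (n + 1)
        for i, c in enumerate(prev):
            nxt[i] -= n * c / (n + 1)
        prev, cur = cur, nxt
    return cur

def derivative(coeffs):
    """Coefficients of the derivative of a polynomial."""
    return [i * c for i, c in enumerate(coeffs)][1:]

def evaluate(coeffs, x):
    total = 0.0
    for c in reversed(coeffs):
        total = total * x + c
    return total

def associated(l, m, theta):
    """sin^|m|(theta) times the |m|-th derivative of P_l at cos(theta)."""
    coeffs = legendre_coeffs(l)
    for _ in range(abs(m)):
        coeffs = derivative(coeffs)
    return math.sin(theta) ** abs(m) * evaluate(coeffs, math.cos(theta))

def residual(l, m, theta, step=1e-4):
    """Left-hand side of P'' + cot(theta) P' + [l(l+1) - m^2/sin^2(theta)] P."""
    p_minus = associated(l, m, theta - step)
    p_zero = associated(l, m, theta)
    p_plus = associated(l, m, theta + step)
    second = (p_plus - 2.0 * p_zero + p_minus) / step ** 2
    first = (p_plus - p_minus) / (2.0 * step)
    bracket = l * (l + 1) - m * m / math.sin(theta) ** 2
    return second + first / math.tan(theta) + bracket * p_zero
```

For any admissible $l$, $m$ and any angle away from the poles the residual is zero to within the accuracy of the finite differences.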


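We shall not write down the explicit expressions for the coefficients $a$, $b$ (which depend upon the way the functions $P$ are normalized), and shall now turn to the investigation of the radial factors $F$, $G$ and the associated question of the characteristic values of the energy $\varepsilon$.

The functions $F$ and $G$ can be investigated by transforming the equation $(\varepsilon - \varepsilon')\psi = 0$ to polar coordinates and getting rid of the angular factors in $\psi$ with the help of the preceding expressions.

To carry out this transformation we multiply the term $\gamma\cdot\mathbf{u}$ in $\varepsilon$ by the square of the ‘radial projection’ of the vector $\gamma$:

$$\gamma_r = \frac{1}{r}(x\gamma_x + y\gamma_y + z\gamma_z) = \frac{1}{r}\,\gamma\cdot\mathbf{r}.$$  (293)

Taking into account the general relation

$$(\gamma\cdot\mathbf{A})(\gamma\cdot\mathbf{B}) = (\xi\cdot\mathbf{A})(\xi\cdot\mathbf{B}) = \mathbf{A}\cdot\mathbf{B} + i(\mathbf{A}\times\mathbf{B})\cdot\xi,$$

we get $\gamma_r^2 = 1$ and, further,

$$\gamma_r\,\gamma\cdot\mathbf{u} = \frac{1}{r}(\mathbf{r}\cdot\mathbf{u} + i\mathbf{L}\cdot\xi),$$

whence

$$\gamma\cdot\mathbf{u} = \gamma_r^2\,\gamma\cdot\mathbf{u} = \frac{\gamma_r}{r}(\mathbf{r}\cdot\mathbf{u} + i\mathbf{L}\cdot\xi).$$  (293 a)

Now for a spherically symmetrical electric field we have $\mathbf{u} = \mathbf{p}$, and consequently

$$\mathbf{r}\cdot\mathbf{u} = \frac{h}{2\pi i}\Bigl(x\frac{\partial}{\partial x} + y\frac{\partial}{\partial y} + z\frac{\partial}{\partial z}\Bigr) = \frac{h}{2\pi i}\,r\frac{\partial}{\partial r}.$$

We thus get, with the help of the equation $\mathbf{L}\cdot\xi = h(k\gamma_0 - 1)/2\pi$,

$$\varepsilon = \frac{hc}{2\pi i}\gamma_r\Bigl(\frac{\partial}{\partial r} + \frac{1}{r} - \frac{k}{r}\gamma_0\Bigr) + m_0c^2\gamma_0 + U,$$

so that Dirac’s equation reduces to the form

$$\Bigl[\frac{hc}{2\pi i}\gamma_r\Bigl(\frac{\partial}{\partial r} + \frac{1}{r} - \frac{k}{r}\gamma_0\Bigr) + \varepsilon_0\gamma_0 + U - \varepsilon'\Bigr]\psi = 0,$$

where $\varepsilon_0 = m_0c^2$.

Since the operator-matrix $\gamma_r$ commutes with $\partial/\partial r$ and $1/r$, and anticommutes with $\gamma_0$, this can be written

$$\Bigl(\frac{\partial}{\partial r} + \frac{1}{r} + \frac{k}{r}\gamma_0\Bigr)\gamma_r\psi + \frac{2\pi i}{hc}(\varepsilon_0\gamma_0 + U - \varepsilon')\psi = 0.$$  (294)

By the definition of the matrices $\gamma_x$, $\gamma_y$, $\gamma_z$ [cf. (273 b)], we have

$$(\gamma_r\psi)_1 = \frac{1}{r}[(x + iy)\psi_4 - z\psi_3], \qquad (\gamma_r\psi)_2 = \frac{1}{r}[(x - iy)\psi_3 + z\psi_4],$$

$$(\gamma_r\psi)_3 = \frac{1}{r}[(x + iy)\psi_2 - z\psi_1], \qquad (\gamma_r\psi)_4 = \frac{1}{r}[(x - iy)\psi_1 + z\psi_2],$$

or, putting $\psi_1 = \varphi_1$, $\psi_2 = \varphi_2$, $\psi_3 = \chi_1$, $\psi_4 = \chi_2$, and

$$\sigma_r = \frac{1}{r}\,\sigma\cdot\mathbf{r},$$

$$(\gamma_r\psi)_1 = (\sigma_r\chi)_1, \qquad (\gamma_r\psi)_2 = (\sigma_r\chi)_2, \qquad (\gamma_r\psi)_3 = (\sigma_r\varphi)_1, \qquad (\gamma_r\psi)_4 = (\sigma_r\varphi)_2.$$

The equation (294) is thus equivalent to the following two:

$$\begin{aligned}
\Bigl(\frac{\partial}{\partial r} + \frac{1 + k}{r}\Bigr)\sigma_r\chi + \frac{2\pi i}{hc}(\varepsilon_0 + U - \varepsilon')\varphi &= 0\\
\Bigl(\frac{\partial}{\partial r} + \frac{1 - k}{r}\Bigr)\sigma_r\varphi - \frac{2\pi i}{hc}(\varepsilon' + \varepsilon_0 - U)\chi &= 0
\end{aligned}$$  (294 a)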
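The latter equation can be multiplied by $\sigma_r$, giving, since $\sigma_r$ commutes with $\partial/\partial r$ and since its square is equal to 1, just as for $\gamma_r$,

$$\Bigl(\frac{\partial}{\partial r} + \frac{1 - k}{r}\Bigr)\varphi - \frac{2\pi i}{hc}(\varepsilon' + \varepsilon_0 - U)\sigma_r\chi = 0.$$  (294 b)

The equations (294 a) and (294 b) serve for the determination of the functions $\varphi$ and $\sigma_r\chi$. It should be remembered that each of these functions represents a pair of ordinary functions. We thus see that the two functions of each pair have the same radial factor, in agreement with our previous results. Putting

$$\varphi = F(r), \qquad \sigma_r\chi = iG(r),$$

we obtain the following system:

$$\begin{aligned}
\Bigl(\frac{d}{dr} + \frac{1 - k}{r}\Bigr)F + \frac{2\pi}{hc}(\varepsilon' + \varepsilon_0 - U)G &= 0\\
\Bigl(\frac{d}{dr} + \frac{1 + k}{r}\Bigr)G - \frac{2\pi}{hc}(\varepsilon' - \varepsilon_0 - U)F &= 0
\end{aligned}$$  (295)

Using the identity

$$\Bigl(\frac{d}{dr} + \frac{1}{r}\Bigr)F = \frac{1}{r}\frac{d}{dr}(rF),$$

we have

$$\begin{aligned}
\Bigl(\frac{d}{dr} - \frac{k}{r}\Bigr)f + \frac{2\pi}{hc}(\varepsilon' + \varepsilon_0 - U)g &= 0\\
\Bigl(\frac{d}{dr} + \frac{k}{r}\Bigr)g - \frac{2\pi}{hc}(\varepsilon' - \varepsilon_0 - U)f &= 0
\end{aligned}$$  (296)

where

$$g = rG, \qquad f = rF.$$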

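We shall solve these equations for the particular case of the hydrogen-like atom, i.e. an electron moving in a Coulomb field with a potential energy $U = -Ze^2/r$. We shall assume that $\varepsilon' < \varepsilon_0$, which corresponds to a bound electron ($H' < 0$) and leads to a discrete set of energy-levels. Putting, for the sake of brevity,

$$\frac{2\pi}{hc}(\varepsilon' + \varepsilon_0) = \alpha^2, \qquad \frac{2\pi}{hc}(\varepsilon_0 - \varepsilon') = \beta^2, \qquad \frac{2\pi Ze^2}{hc} = \gamma,$$

we get for this case

$$\begin{aligned}
\Bigl(\frac{d}{dr} - \frac{k}{r}\Bigr)f + \Bigl(\alpha^2 + \frac{\gamma}{r}\Bigr)g &= 0\\
\Bigl(\frac{d}{dr} + \frac{k}{r}\Bigr)g + \Bigl(\beta^2 - \frac{\gamma}{r}\Bigr)f &= 0
\end{aligned}$$  (296 a)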
For large values of $r$ these equations reduce to

$$\frac{df}{dr} + \alpha^2g = 0, \qquad \frac{dg}{dr} + \beta^2f = 0,$$

giving the following asymptotic solution:

$$f = Ae^{-\alpha\beta r}, \qquad g = Be^{-\alpha\beta r}, \qquad A\beta = B\alpha,$$  (296 b)

where $A$ and $B$ are considered as constants.

To get the exact solution we replace them by polynomials

$$A = A_0r^{\mu} + A_1r^{\mu+1} + \cdots + A_sr^{\mu+s},$$

$$B = B_0r^{\mu} + B_1r^{\mu+1} + \cdots + B_sr^{\mu+s},$$

obtaining the following relations between the coefficients:

$$\begin{aligned}
A_n(\mu + n - k) + \gamma B_n &= \alpha\beta A_{n-1} - \alpha^2B_{n-1}\\
B_n(\mu + n + k) - \gamma A_n &= \alpha\beta B_{n-1} - \beta^2A_{n-1}
\end{aligned}$$  (297)

Multiplying the first of these equations by $\beta$ and the second by $\alpha$ and adding the results, we get

$$A_n[\beta(\mu + n - k) - \alpha\gamma] + B_n[\alpha(\mu + n + k) + \beta\gamma] = 0.$$  (297 a)

The ‘boundary conditions’ $A_n = B_n = 0$ for $n = -1$ and $n = s + 1$ applied to (297) give

$$A_0(\mu - k) + \gamma B_0 = 0, \qquad B_0(\mu + k) - \gamma A_0 = 0; \qquad \beta A_s = \alpha B_s.$$

Eliminating $A_0$ and $B_0$ between the first two equations, we get

$$\mu = +\sqrt{k^2 - \gamma^2}.$$  (297 b)


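The ratio

$$\frac{A_0}{B_0} = \frac{\sqrt{k^2 - \gamma^2} + k}{\gamma} = \frac{\gamma}{k - \sqrt{k^2 - \gamma^2}},$$

which follows from the preceding equation, is identical with that which is obtained from (297 a) for $n = 0$. With $n = s$ we get, on the other hand,

$$A_s[\beta(\mu + s - k) - \alpha\gamma] + B_s[\alpha(\mu + s + k) + \beta\gamma] = 0,$$

which becomes identical with $\beta A_s = \alpha B_s$ on using the condition

$$2\alpha\beta(\mu + s) = (\alpha^2 - \beta^2)\gamma.$$  (297 c)

With the above definitions of $\alpha$, $\beta$ we get

$$\sqrt{\varepsilon_0^2 - \varepsilon'^2}\,(\mu + s) = \varepsilon'\gamma,$$

that is, from (297 b),

$$\varepsilon' = \frac{\varepsilon_0}{\sqrt{1 + \dfrac{\gamma^2}{\bigl(s + \sqrt{k^2 - \gamma^2}\bigr)^2}}}.$$  (298)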

This is exactly Sommerfeld’s formula (264) (with $\gamma Z$ replaced by $\gamma$).† The angular quantum number $k$ has the same meaning in both cases, so far as the value of the energy is concerned. It must be remembered, however, that in the previous theory it was supposed to be essentially positive, whereas in Dirac’s theory it can assume both positive and negative values (zero excluded). With $k > 0$ we get $l = k - 1$ and $j = l + \tfrac12 = k - \tfrac12$, i.e. a solution of the type (290 a); while in the case $k < 0$ we obtain a solution of the type (290 b) with $l = |k|$ and $j = |k| - \tfrac12$.
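Formula (298) and the assignment of $l$ and $j$ to a given $k$ lend themselves to a short numerical illustration. The sketch below is an assumption-laden example rather than part of the text: it uses the fact that $\gamma = 2\pi Ze^2/(hc)$ is $Z$ times the fine-structure constant, inserts a modern rounded value for the latter, and returns the energy in units of $\varepsilon_0 = m_0c^2$; the function names are hypothetical.

```python
import math

# Modern rounded value of 2 pi e^2 / (h c); an assumption of this sketch,
# not a number quoted in the text.
FINE_STRUCTURE = 7.2973525693e-3

def sommerfeld_energy(s, k, Z=1):
    """Energy level (298) in units of eps_0, for radial number s and integer k."""
    if k == 0:
        raise ValueError("k = 0 is excluded")
    gamma = Z * FINE_STRUCTURE
    mu = math.sqrt(k * k - gamma * gamma)           # (297 b)
    return 1.0 / math.sqrt(1.0 + (gamma / (s + mu)) ** 2)

def quantum_numbers(k):
    """(l, j) belonging to a given k, as stated in the text."""
    if k == 0:
        raise ValueError("k = 0 is excluded")
    l = k - 1 if k > 0 else -k
    j = abs(k) - 0.5
    return l, j

level_plus = sommerfeld_energy(1, 2)      # k = +2
level_minus = sommerfeld_energy(1, -2)    # k = -2, same energy
binding = 1.0 - sommerfeld_energy(0, 1)   # lowest level, in units of eps_0
```

Since (298) contains $k$ only through $k^2$, the levels for $+k$ and $-k$ coincide exactly, while `quantum_numbers` shows that they belong to different $l$ and the same $j$; the binding energy of the lowest level comes out close to $\gamma^2/2$, the non-relativistic value.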

It should be emphasized that the two solutions are characterized not
only by different angular factors, but also, as is plainly seen from (297),
by different radial factors $F = f/r$ and $G = g/r$; their similarity is
restricted to the value of the energy and of the $z$-component of the
angular momentum $M_z$.
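
The coincidence of the levels for opposite signs of $k$ can be checked numerically. The following is only a minimal sketch, assuming Sommerfeld’s formula in the form $\epsilon'/\epsilon_0 = [1+\gamma^2/(s+\sqrt{k^2-\gamma^2})^2]^{-1/2}$; the function name `sommerfeld_energy` and the numerical value of $\gamma$ are illustrative assumptions, not part of the text.

```python
import math

def sommerfeld_energy(s: int, k: int, gamma: float) -> float:
    """Return epsilon'/epsilon_0 for radial number s and angular number k.

    Assumes the form eps'/eps_0 = 1/sqrt(1 + gamma^2/(s + mu)^2)
    with mu = sqrt(k^2 - gamma^2), as in (297 b).
    """
    if k == 0:
        raise ValueError("k = 0 is excluded")
    mu = math.sqrt(k * k - gamma * gamma)
    return 1.0 / math.sqrt(1.0 + (gamma / (s + mu)) ** 2)

# Assumed value of gamma for hydrogen (fine-structure constant).
GAMMA = 1 / 137.036

# Normal state: s = 0, k = 1; the formula reduces to sqrt(1 - gamma^2).
ground = sommerfeld_energy(0, 1, GAMMA)
print(ground, math.sqrt(1 - GAMMA ** 2))

# Only k^2 enters, so the levels +k and -k coincide in a Coulomb field.
print(sommerfeld_energy(1, 1, GAMMA) == sommerfeld_energy(1, -1, GAMMA))
```

Since only $k^2$ enters the formula, the two states differ in their wave functions but not in their energy, which is the degeneracy removed by any screening.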

The coincidence of the energy-levels corresponding to opposite values
of $k$ is a characteristic feature of the motion in a purely Coulomb field
of force. If the motion of the electron takes place in a field even
moderately deviating from the latter, due, for instance, to the variable
shielding action of the inner electrons in an alkali atom, the energies
of the states $+k$ and $-k$ become different and we obtain what is called
a ‘screening doublet’. The two levels of such a doublet state belong
to two different values of the Schrödinger angular number $l$, namely,
$l = |k|-1$ and $l = |k|$, and to the same value of the inner quantum
number $j = |k|-\tfrac12$. It should be mentioned that in the case of small
values of $j$ the separation between the two energy-levels in alkali atoms
or ions of a similar structure is so large that they are no longer
considered as forming a doublet and are referred to different series. This
notion can, however, be conveniently applied to X-ray absorption levels.

† If instead of Dirac’s equation we used the relativity second-order equation $D\psi = 0$,
in the present case

$$\nabla^2\psi + \frac{4\pi^2}{h^2c^2}\Bigl[\Bigl(\epsilon'+\frac{Ze^2}{r}\Bigr)^2 - \epsilon_0^2\Bigr]\psi = 0,$$

not involving the spin, we should have obtained a solution of the same type

$$\psi = F(r)\,Y_{lm}(\theta,\phi)$$

as in Schrödinger’s theory, with

$$rF = f = e^{-\alpha r}\sum_{n=0}^{s} A_n r^{\mu+n}$$

and

$$\epsilon' = \epsilon_0\Bigl[1+\frac{\gamma^2}{\bigl(s+\tfrac12+\sqrt{(l+\tfrac12)^2-\gamma^2}\bigr)^2}\Bigr]^{-1/2},$$

corresponding to half-integral values of the radial and angular quantum numbers ($s+\tfrac12$
instead of $s$, and $l+\tfrac12$ instead of $k$). This result is, however, contradicted by the
experimental data, which are in agreement with Sommerfeld’s formula.

The two levels corresponding to the same value of the Schrödinger
angular quantum number $l$ and to consecutive values of the inner
quantum number $j = l-\tfrac12$ ($k = -l$) and $j = l+\tfrac12$ ($k = l+1$) are said
to form a ‘relativity doublet’. According to Sommerfeld’s formula (298)
they correspond to consecutive values of the old angular quantum
number $|k|$ ($= l$, $l+1$). Since in the Bohr-Sommerfeld theory this
number determined the eccentricity of the elliptical orbits, the relativity
doublets were associated with orbits of different eccentricity. From
the point of view of the present theory, the relativity doublets should
be associated rather with orbits of the same size and eccentricity but
with opposite orientations of the spin. Such relativity or ‘spin’-doublets
are extremely narrow in hydrogen or ionized helium, but they become
very broad in X-ray spectra, their width increasing roughly as the
fourth power of the effective nuclear charge [according to the
approximate formula (264 c)]. They are rather broad, too, in the spectra of
alkali atoms and other complicated systems with one external electron.
In this case, however, they are due not to a large effective nuclear
charge, but to a rapid variation of the latter, owing to the decrease of
the shielding effect of the inner electrons when the outer electron
approaches the nucleus. Sommerfeld’s formula is, of course,
inapplicable to this case, which is characterized by a large $\Delta l$-separation
(‘screening effect’) and a relatively small $\Delta j$-separation (‘spin’ or
relativity effect).

To a given value of $k$ (i.e. of $l$ and $j$) there corresponds a degenerate
set of states specified by different values of the axial quantum number $m$
or of the number $m_j = m+\tfrac12$ which determines the $z$-component of the
total angular momentum. This degeneracy is of exactly the same
type as that discussed before in connexion with Schrödinger’s theory;
it can be pictured as due to the possibility of $2j+1 = 2|k|$ quantized
orientations of the angular momentum vector with regard to the $z$-axis,
corresponding to all half-integral values of $m+\tfrac12$ between $+j$ and $-j$.
We have in fact in the case $k > 0$ a set of function-quadruplets $\psi$ with
the following angular factors: $Y_{k-1,m+1}$, $Y_{k-1,m}$; $Y_{k,m+1}$, $Y_{k,m}$. The
maximum or minimum admissible value of $m$ is that for which one function
at least of each pair is different from zero. We thus get $m \le k-1$ and
$m \ge -k$, i.e.

$$-k+\tfrac12 \le m+\tfrac12 \le k-\tfrac12.$$

A similar relation with $k$ replaced by $|k|$ is obtained in the case $k < 0$.
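
The counting of sub-states can be made explicit with a short sketch. It assumes only the relations quoted in this section, namely $l = k-1$, $j = k-\tfrac12$ for $k > 0$, $l = |k|$, $j = |k|-\tfrac12$ for $k < 0$, and $-|k| \le m \le |k|-1$; the helper names `l_and_j` and `m_values` are hypothetical.

```python
from fractions import Fraction
from typing import List, Tuple

def l_and_j(k: int) -> Tuple[int, Fraction]:
    """Return (l, j) belonging to a given non-zero angular number k."""
    if k == 0:
        raise ValueError("k = 0 is excluded")
    l = k - 1 if k > 0 else -k
    j = Fraction(2 * abs(k) - 1, 2)
    return l, j

def m_values(k: int) -> List[int]:
    """Admissible axial numbers m, so that m + 1/2 runs from -j to +j."""
    return list(range(-abs(k), abs(k)))

for k in (1, -1, 2, -2):
    l, j = l_and_j(k)
    ms = m_values(k)
    # The number of sub-states is 2j + 1 = 2|k|.
    print(k, l, j, [m + Fraction(1, 2) for m in ms], len(ms) == 2 * j + 1)
```

For $k = 1$ the list reduces to $m = -1, 0$, i.e. $m_j = \mp\tfrac12$, which are the two sub-states of the normal state.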
Thus, for example, in the particular case $k = 1$, $l = 0$ and $j = \tfrac12$,
which corresponds to the normal state of the hydrogen atom ($n = 1$;
it should be mentioned that the case $k = -1$, i.e. $l = 1$, corresponds
to an excited state with $n = 2$) we actually obtain two sub-states specified
by the following expressions for the functions $\psi_1, \dots, \psi_4$:

$$\psi_\alpha = R\,Y_\alpha \qquad (\alpha = 1, 2, 3, 4),$$

with the radial factor

$$R(r) = r^{\sqrt{1-\gamma^2}-1}\,e^{-r/a},$$


and the angular factors 

Fi = 0, Y 2 — 1 , Y 3 = 


■ v—sinfle^, 

i+V(i-y 2 ) 


*4 = 




i+V(i-y 2 ) 


- cos 0 


in the case m = 0 , i.e. mj = +£, and 

F x = -1, Y z = 0 , F 3 — — 


Yi i+V(i-?) 


i+V(i—y 2 ) 

sin 0 


cos 6 , 


in the case $m = -1$, i.e. $m_j = -\tfrac12$. The two states correspond to the
same value of the inner quantum number $j$, namely, $j = \tfrac12$. They
are associated with the same spherically symmetrical distribution of
the probability density, which is proportional to the square of the
radial factor $R(r)$. It should be noticed that this factor becomes
infinite at $r = 0$, but in such a way that the integral $\int_0^\infty R^2 r^2\,dr$ remains
convergent.

The difference between the two states consists in the fact that for
the first of them the spin axis of the electron is pointing in the positive
and for the second in the negative direction of the $z$-axis, as follows
from the approximate equation for the characteristic values

with $\psi_3 = \psi_4 = 0$.

We must consider in conclusion the modification of the states, and
in particular of the energy-levels, of a hydrogen-like or an alkali-like atom
in the presence of a homogeneous magnetic field $\mathfrak{H}$ (Zeeman effect).
In the former case we have to deal with a twofold ($k$, $-k$) degeneracy,
corresponding to the absence of any screening effect. This degeneracy is
to be taken into account for very weak magnetic fields only, so weak
that the product $\mu\mathfrak{H}$ is very small compared with the relativistic ($\Delta j$)
separation. In the latter case, on the contrary, the relativistic splitting
is as a rule much smaller than the screening ($\pm k$) separation, so that
for fields of moderate strength the only degeneracy present is that
which corresponds to different values of the axial quantum number $m$.

It can easily be shown that the characteristic functions $\psi$
corresponding to this privileged character of the $z$-axis in the absence of
a magnetic field are such that the non-diagonal matrix elements of the
magnetic perturbation energy

$$S = \tfrac12 e\,\mathfrak{H}\cdot(\mathbf{r}\times\boldsymbol{\gamma}) = \tfrac12 e\mathfrak{H}\,(x\gamma_y - y\gamma_x) \tag{299}$$

all vanish. So long as the magnetic field is sufficiently weak the
additional energy due to its action can be determined accordingly as the
diagonal elements of $S$ with regard to the corresponding unperturbed
states.

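The additional magnetic energy of a state specified by the quantum
numbers $k$, $m$ is thus given by the formula

$$\Delta\epsilon'_{km} = S_{km,km} = \int \psi^{\dagger}_{km}\,S\,\psi_{km}\,dV. \tag{299 a}$$

Dropping for the sake of simplicity the indices $k$, $m$, we have, according
to (299),

$$(S\psi)_1 = \tfrac12 e\mathfrak{H}\,i(x+iy)\psi_4, \qquad (S\psi)_2 = -\tfrac12 e\mathfrak{H}\,i(x-iy)\psi_3,$$

$$(S\psi)_3 = \tfrac12 e\mathfrak{H}\,i(x+iy)\psi_2, \qquad (S\psi)_4 = -\tfrac12 e\mathfrak{H}\,i(x-iy)\psi_1,$$

and consequently

$$\psi^{\dagger}S\psi = -e\mathfrak{H}\,\operatorname{Re}\Bigl[\frac{1}{i}(x+iy)(\psi_1^*\psi_4+\psi_3^*\psi_2)\Bigr]$$

or

$$\psi^{\dagger}S\psi = -e\mathfrak{H}\,\operatorname{Re}\Bigl[\frac{1}{i}\,r\sin\theta\,e^{i\phi}(\psi_1^*\psi_4+\psi_3^*\psi_2)\Bigr]. \tag{299 b}$$

Substituting here the expressions for the functions $\psi$ derived before
and integrating, we get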

^ € km 

= 27re§ J dr F(r)Q(r)r 3 J i(a*a 4 P z+1 , m P/, m+1 +<«4 P l+1 , m+1 P,, Jsin 2 0 d6 

(299 c) 

in the case of the equations (290 a) and a similar expression in the case 
(290b). 

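The radial factor in this expression can easily be calculated with the
help of the differential equations (296) which are satisfied by the
functions $rF = f$ and $rG = g$. Taking the first of these equations and
putting approximately $\epsilon'+\epsilon_0-U \approx 2\epsilon_0$, we get

$$g \approx \frac{hc}{4\pi\epsilon_0}\Bigl(\frac{df}{dr} - \frac{k}{r}f\Bigr),$$

whence

$$\int_0^\infty FG\,r^3\,dr = \int_0^\infty fg\,r\,dr \approx \frac{hc}{4\pi\epsilon_0}\Bigl[\int_0^\infty rf\frac{df}{dr}\,dr - k\int_0^\infty f^2\,dr\Bigr],$$

or since $\int_0^\infty rf\,\dfrac{df}{dr}\,dr = -\tfrac12\int_0^\infty f^2\,dr$,

$$\int_0^\infty FG\,r^3\,dr \approx -\frac{hc}{4\pi\epsilon_0}\,(k+\tfrac12)\int_0^\infty f^2\,dr = -\frac{h}{4\pi m_0 c}\,(k+\tfrac12)$$

if the function $f(r)$ is appropriately normalized ($\int_0^\infty f^2\,dr = 1$).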

The angular factor in (299 c) can also be evaluated without much
trouble with due regard to the normalizing conditions for the functions
$P(\theta)$.

We obtain in this way (neglecting terms of the second order in $1/c$)

$$\Delta\epsilon'_{km} = -\frac{eh}{4\pi m_0 c}\,g\,\mathfrak{H}\,(m+\tfrac12) = -\mu\mathfrak{H}\,g\,(m+\tfrac12), \tag{300}$$

with

$$g = \frac{2k}{2k-1} \quad (k > 0), \qquad g = \frac{2|k|}{2|k|+1} \quad (k < 0), \tag{300 a}$$

in agreement with the results obtained at the end of § 30 (if $m+\tfrac12$ is
identified with $m_j$).
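
As a check on the splitting factor, the following sketch compares $g$ expressed through $k$, taken here as $2k/(2k-1)$ for $k > 0$ and $2|k|/(2|k|+1)$ for $k < 0$, with the familiar Landé expression for a single electron of spin $\tfrac12$. The function names are hypothetical, and the sign of the shift follows the convention $\Delta\epsilon' = -\mu\mathfrak{H}\,g\,(m+\tfrac12)$, which is an assumption of this example.

```python
from fractions import Fraction

def g_from_k(k: int) -> Fraction:
    """Splitting factor written in terms of Dirac's angular number k."""
    if k > 0:
        return Fraction(2 * k, 2 * k - 1)
    if k < 0:
        return Fraction(2 * abs(k), 2 * abs(k) + 1)
    raise ValueError("k = 0 is excluded")

def g_lande(l: int, j: Fraction) -> Fraction:
    """Lande factor for one electron with spin 1/2."""
    s = Fraction(1, 2)
    return 1 + (j * (j + 1) + s * (s + 1) - l * (l + 1)) / (2 * j * (j + 1))

def weak_field_shift(k: int, m: int) -> Fraction:
    """Shift in units of mu*H, assuming the sign convention stated above."""
    return -g_from_k(k) * (m + Fraction(1, 2))

for k in (1, -1, 2, -2, 3):
    l = k - 1 if k > 0 else -k
    j = Fraction(2 * abs(k) - 1, 2)
    print(k, g_from_k(k), g_lande(l, j))
```

For $k = 1$ this gives $g = 2$ and for $k = -1$ it gives $g = \tfrac23$, the values usually quoted for $S_{1/2}$ and $P_{1/2}$ terms.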

The integration of the expression (299 c) requires a great deal of 
calculation. This can be avoided, however, if we replace the operator 
M by the operators 


M e „ = ^(L+2s) or M 


2 m 0 c 


off 


2 m 0 c 


— (L+2s), 


which have been shown in the preceding section to be approximately 
equivalent to it and to each other with an accuracy of the second order 
in 1/c. To the same approximation we can replace y 0 in the expression 

(287 a) by 1, with the result L 2 = — k(k—l) = —J(£+1) when k > 0. 

477 * 47 T £ 

( h \ 2 

— 1 j(j+ 1) and putting 


8 = (g— 1)M, we obtain, with the help of (267 a) and (289 b), the above 
approximate expression for A^ m . 

The preceding theory is applicable only to a comparatively weak
magnetic field. When the shift of the energy-levels produced by the
magnetic field becomes of the same order of magnitude as the $\Delta j$-doublet
separation, the spin perturbation to which this separation is due must
be taken into account together with the magnetic perturbation.

We must start in this case with the two unperturbed states of equal
energy $\epsilon'_{lm}$, specified by the same values of $l$ and $m$ and belonging to
the values $j = l\pm\tfrac12$ of the inner quantum number. The combined
spin-magnetic perturbation $S = S_{sp}+S_m$ produces a splitting-up of the
unperturbed energy-level into two levels $\epsilon'_{lmj}$ according to the
equation

$$\begin{vmatrix} S_{11}-\Delta\epsilon' & S_{12} \\ S_{21} & S_{22}-\Delta\epsilon' \end{vmatrix} = 0,$$

where the index 1 refers to one of the two degenerate states ($j = l+\tfrac12$,
say), and the index 2 to the other ($j = l-\tfrac12$).

The non-diagonal elements of the spin perturbation $(S_{sp})_{12}$ and $(S_{sp})_{21}$
must obviously vanish since the states $j = l\pm\tfrac12$ are stationary in the
absence of the magnetic field. The diagonal elements $(S_{sp})_{11} = \Delta_1\epsilon'$,
$(S_{sp})_{22} = \Delta_2\epsilon'$ can be defined therefore as the additional energies due
to the spin perturbation alone, their difference $\delta = \Delta_1\epsilon'-\Delta_2\epsilon'$ being
equal to the $\Delta j$-doublet separation in the absence of the magnetic field.
The action of the latter can thus be determined by the equation


s m n—A lf '-Ac' 

$ml2 _ 

Sm2l 

$rn22~&2 € 


Sm “ = 

$m!2 = Sm21 ” 


2 (i+b 


( m +h) 


(301 a) 


The first two expressions are given by (300); the expressions for S ml2 and 

S m21 can be derived in a similar manner [see § 20, equation (155 b)]. 

It is customary to refer the displaced energy-levels e[ and c 2 to the 

‘centre of gravity" of the doublet, i.e. to the energy c p determined by 

the formulae , , . n , , , , 1Q 

*1 = *o+(Z+l)p, € 2 = € 0~ l P 

[8 = (2Z+l)/3 = e[-c']. Putting A x c' = (1+1)0, A 2 c' = -Z0, and 
c '—€q — Ac', we obtain from (300) the following equation for Ac': 

(Aty+lP+fjiW m+l)JA€'~Z(Z+l)^+(^) 2 m(m+l) - 0. 

Its solution runs 

Ac' = — | p—p&im+Ddz 

±<]{b*&(m+l)+lP] 2 +P 2 (l+l)l~lSbM™+l)}. (301 b) 
If the magnetic field is very weak, we get, in the first approximation,

At' = 


i.e. 


and 


Ac'= -0(1+1)-,*$ j£j(«+|). 
At' = +/B- / iS J -L(m+i), 


in agreement with (300). In the opposite case of a very strong magnetic 
field—so strong that the doublet distance 8 is small in comparison with 
the splitting /x$ due to the field alone (when 8 = 0)—the formula 
(301 b) reduces to 


Ae' = — /x§(m+— “p§(w±l), 
i.e. to the earlier formula (266 b) which determines the normal Zeeman 
effect. 


34. Negative Energy States; Positive Electrons and Neutrons 

We have seen above that in Pauli’s theory the two values $\alpha = 1$ and
$\alpha = 2$ of the spin-coordinate refer to the two opposite orientations of
the electron’s spin or magnetic axis parallel to the $z$-axis. One might
be inclined to think that the values $\alpha = 1, 2, 3, 4$ of the Dirac theory
refer to four different orientations of the electron. This is, however,
not true. Taking the probable value of the spin angular momentum in
the $z$ direction we get, according to (275 b):

$$\bar{s}_z = \frac{h}{4\pi}\int(-\psi_1^*\psi_1+\psi_2^*\psi_2-\psi_3^*\psi_3+\psi_4^*\psi_4)\,dV,$$

which shows that the values $\alpha = 3$ and $\alpha = 4$ refer to the same
orientations (in the negative and positive direction parallel to $z$) as the values
$\alpha = 1$ and $\alpha = 2$ respectively.

It should be mentioned that we get exactly the opposite result as to
the meaning of $\alpha = 3$ and $\alpha = 4$ if, instead of the angular (mechanical)
momentum, we consider the magnetic moment due to the spin $\boldsymbol{\mu} = \mu\gamma_0\boldsymbol{\xi}$.
We get, namely, in this case [cf. (283 a)]:

$$\bar{\mu}_z = \mu\int(-\psi_1^*\psi_1+\psi_2^*\psi_2+\psi_3^*\psi_3-\psi_4^*\psi_4)\,dV.$$

This shows that in the states $\alpha = 3, 4$ the electron behaves, so far as
its spin magnetic moment is concerned, as a particle with a positive
charge.

As has been explained already, the quadruplicity of the Dirac theory
is connected with the introduction of states of negative energy $\epsilon$. The
values $\alpha = 3, 4$ for a state of this type have the same physical meaning
as the values $\alpha = 1, 2$ for the corresponding state of positive energy
(the functions $\psi_3$, $\psi_4$ being large compared with $\psi_1$, $\psi_2$ in the former
case and small in the latter). The quadruplicity appearing in the
comparison of Schrödinger’s and Dirac’s theory can be pictured as the
result of the reflection of a point representing a Schrödinger state in
the plane $\epsilon = 0$ and further as the splitting of the two points into
a Pauli doublet.

To each characteristic value of Schrödinger’s energy constant $H'$
there correspond in Dirac’s theory four energy values $\epsilon'$ which can be
denoted as follows:

m 0 c 2 +tf'.[, ™ 0 c 2 +tf'j; (> 0), 

m 0 c 2 -f//' + , m 0 c 2 +H'z (< 0), 

the first pair lying close to each other as well as the second pair, the 
two pairs having approximately opposite values. 

The matrix elements of any physical quantity represented by the 
four-dimensional matrix-operator F, as defined by the general formula 

= J </>:• F'h dV = J i dV 

can be combined accordingly into four-dimensional matrices: 

Fr* \ir\ F H .\n' t F n ,\ jr ~ F irir 
Fr'Iii'* F ir + H 'i F h „\_ h - F u ~\ jrz 
Fu'-irl F ir - Irt F h .- u ,- F n .- Ir - 
Fr”zh[^ F irzir + F H . Z11 '- F jrzrrz 

If the function $F\psi_{\epsilon'}$ is expanded in a series of functions $\psi_{\epsilon''}$, according
to the formula

$$F\psi_{\epsilon'} = \sum_{\epsilon''} F_{\epsilon''\epsilon'}\,\psi_{\epsilon''},$$

negative energy states must be taken into account as well as the
states of positive energy unless the matrix elements $F_{\epsilon'\epsilon''}$, where $\epsilon' > 0$
and $\epsilon'' < 0$, all vanish. This circumstance is especially important in
various perturbation problems; with $F$ denoting the operator of the
perturbation energy, correct results as to the probability of combined
(double) transitions are obtained only if intermediate states of negative
energy are considered along with those of positive energy. In the
problem of the scattering of light by a free electron, for example, the
relative importance of intermediate states of negative energy is larger
the smaller the (positive) energy of the initial and final state. This
result (due to Tamm) is especially startling because relativity corrections
vanish in the limiting case of small velocities, so that negative energy
states which form a characteristic relativity effect would be expected
to become insignificant in this limiting case.

Another interesting example of the paradoxical role played in Dirac’s
theory by the states of negative energy is presented by the motion of
an electron through a potential energy jump, as discussed by O. Klein.
For the sake of simplicity we shall take the equation of the second
order, $D\psi = 0$ ($D = u^2-u_t^2+m_0^2c^2$), to which the four equations of the
Dirac theory reduce for free motion. The continuity conditions for the
four functions $\psi_\alpha$ can be replaced in this case by the continuity
condition for one of them and its derivative in the direction of the
energy jump. Assuming the latter to take place in the direction of
the $x$-axis, the potential energy being equal to 0 on the left of the
plane $x = 0$ and $U = \text{const.} > 0$ on the right, and assuming further the
electron to move parallel to the $x$-axis, we get

$$\psi = A'e^{i\frac{2\pi}{h}g_a x} + A''e^{-i\frac{2\pi}{h}g_a x}$$

for $x < 0$ (incident and reflected wave), and

$$\psi = B'e^{i\frac{2\pi}{h}g_b x}$$

for $x > 0$ (transmitted wave), where

$$g_a^2 = \epsilon^2/c^2 - m_0^2c^2 \quad\text{and}\quad g_b^2 = (\epsilon-U)^2/c^2 - m_0^2c^2.$$

The continuity conditions give the same relations $A'+A'' = B'$ and
$A'-A'' = B'g_b/g_a$ as in the non-relativity theory [cf. Part I]. The
important difference between the latter and the present theory
consists in the fact that the above relativity expression for $g_b$ remains real
not only in the case when $U$ is smaller than the kinetic energy of
the incident electron $\epsilon-m_0c^2$, but also in the case when it is larger
than $m_0c^2+\epsilon \approx 2m_0c^2$ (if $\epsilon$ is not very different from $m_0c^2$). This
means that total reflection ($g_b$ imaginary) takes place only within the
range

$$\epsilon-m_0c^2 < U < \epsilon+m_0c^2,$$

whereas beyond it we get transmission both for small and for large
values of $U$.
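
The three ranges of $U$ can be illustrated by a small sketch. It assumes $g_b^2c^2 = (\epsilon-U)^2-(m_0c^2)^2$ and energies measured in units of $m_0c^2$; the function names and the labels returned are illustrative only.

```python
def gb_squared(eps: float, U: float, m0c2: float = 1.0) -> float:
    """Return g_b^2 * c^2 = (eps - U)^2 - (m0 c^2)^2 for a jump of height U."""
    return (eps - U) ** 2 - m0c2 ** 2

def classify_step(eps: float, U: float, m0c2: float = 1.0) -> str:
    """Classify the behaviour at the jump according to the sign of g_b^2."""
    if gb_squared(eps, U, m0c2) < 0:
        return "total reflection"          # g_b imaginary
    if U < eps - m0c2:
        return "ordinary transmission"     # U below the kinetic energy
    return "paradoxical transmission"      # U above eps + m0 c^2

# A slow electron: total energy 1.1 m0c^2, kinetic energy 0.1 m0c^2.
eps = 1.1
for U in (0.05, 1.0, 2.5):
    print(U, gb_squared(eps, U), classify_step(eps, U))
```

With these numbers the jump $U = 1.0$ lies inside the range of total reflection, while $U = 2.5$ exceeds $\epsilon+m_0c^2 = 2.1$ and $g_b$ is real again.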

It seems hardly possible to give a reasonable interpretation of this
result. It can be shown, however, that the paradoxical transmission
probability for the case $U > \epsilon+m_0c^2$ rapidly decreases when the
discontinuity $U$ in the potential energy at $x = 0$ is replaced by a gradual
increase within an interval comparable with or larger than the
wave-length of the electron $\lambda = h/g$.

The physical meaning of the states of negative energy is at present 
not quite certain. They were initially interpreted by Dirac in con¬ 
nexion with the duplicity of electricity, and served to reduce protons to a 
mere absence of electrons if space is assumed to be nearly saturated with 
electrons in states of negative energy, with due regard to Pauli’s ex¬ 
clusion principle. It is, however, impossible to interpret in this way the 
difference in the mass of electrons and protons. According to Pauli and 
to Weyl the rest-mass of a proton considered as a hole in the distribu¬ 
tion of electrons with negative energies should be exactly equal to the 
rest-mass m 0 of an electron. 

Although Dirac’s original theory has thus failed to reduce protons to 
electrons, yet it may perhaps be credited with predicting the existence 
and properties of things that have hitherto never been anticipated by the 
experimental physicist and that seem to reveal themselves in the Wilson 
chamber cloud-tracks of particles released by the penetrating rays of 
cosmic origin and by very hard gamma rays. These are the ‘positive
electrons’ whose discovery has recently been announced by Anderson
(1932) and also by Blackett (1933).

The experimental data are still too scarce to make it sure that positive 
electrons really exist. But if they do exist they fit beautifully in the 
scheme of Dirac’s theory. The fact that they are not found under 
ordinary conditions is explained by the extremely large probability that 
a ‘positive electron’ will recombine with a negative one (the latter 
falling from a state of positive energy into the hole constituting the 
former), this recombination being accompanied by the emission of two 
photons (cf. Part I, § 19). 

The visible existence of the material world around us must be 
guaranteed from this point of view by the fact that the total number of 
electrons is larger than the number of available states of negative energy, 
at least in that part of the world which is accessible to observation. 

Assuming the existence of positive electrons, it would be natural to
postulate the existence of ‘negative protons’ formed by holes in a
practically saturated distribution of protons between states of negative
energy.

It is difficult, however, to accept the idea that space is filled up with 
one or two sorts of particles forming a kind of infinitely dense ‘ether’ 
which is revealed in a negative way only through the occasional absence 
of the full quota of these particles. 
Dirac’s equation has served as a starting-point for the introduction,
besides positive electrons, of particles devoid of electrical charge and
denoted accordingly as ‘neutrons’. Dirac himself attempted in 1931 to
introduce neutrons as magnetic analogues of electrons, i.e. as particles
possessing a magnetic charge instead of an electric one. Pauli on the
other hand proposed (simultaneously with Dirac) a theory of neutrons
devoid of charge (both electric and magnetic) but possessing a magnetic
moment and a spin angular momentum associated with it. The necessity,
or rather plausibility, of introducing neutrons in addition to protons
and electrons as constituent parts of atomic nuclei was dictated by
certain nuclear phenomena, like the apparent failure of the
alternation principle (Bose-Einstein statistics holding for nuclei supposed to
consist of an odd number of particles) and of the principle of
conservation of energy (continuous $\beta$-ray spectra of radioactive substances).
These difficulties could be removed by admitting the existence in the
nuclei of a third sort of elementary particles in a bound state. The idea
of treating these particles as ‘magnetic neutrons’ was suggested by the
possibility of replacing Dirac’s equation for the electron by a similar
equation with $e = 0$ and with the mass $m_0$ increased by an additional
term

L = m (H-5-Eij) 


which represents the action of the magnetic and electric field on the
neutron’s magnetic and electric moment ($\boldsymbol{\xi}$ and $\boldsymbol{\eta}$ being the matrices
(275 b) and (275 c), and $\mu$ hypothetically Bohr’s magneton). Pauli’s
equation for the neutron can thus be written in the usual form



= 0 with 


e = cyp+y 0 (m 0 c 2 +L), 


where p = —. V; the electromagnetic potentials A and <f) do not appear 

2lTl 

in c since the electric charge with which they must be multiplied is sup¬ 
posed equal to zero. 

We shall not stop here to develop Pauli’s theory. The remarkable
fact we are mainly concerned with is that the neutron was discovered
experimentally by Chadwick, following observations by Curie and Joliot,
within a year after its existence had been tentatively admitted on
theoretical grounds. It made its appearance as the disintegration product of
certain nuclei bombarded by protons or $\alpha$-particles in the form of a
particle with a mass very little different from that of a proton (while
Pauli expected it to have a mass of the same order of magnitude as the
electron). It is still a matter open to question whether a neutron is
a simple particle like an electron and a proton, or a combination of
both.† The latter alternative seems the more natural, although we are
not yet in a state to substantiate it theoretically, for the present
wave-mechanical theory is inadequate in treating such systems, whose linear
dimensions are of the same order of magnitude as the ‘size’ of the
electron (attributed to it on the electromagnetic theory of mass). As
to the forces binding the electron and proton in a neutron more
tightly than in a hydrogen atom, they may be due to the mutual
attraction of the spin magnetic moments. In fact this attraction (which
corresponds to a suitable orientation of the spins) increases with
decrease of distance much more rapidly than the attraction due to the
electric charges of the two particles, so that the Coulomb attraction
becomes negligibly small (relatively) at distances of the order of
$10^{-14}$ cm. It cannot be asserted, however, that the usual inverse
fourth-power law for the mutual attraction of two elementary magnets is
applicable for distances comparable with the electron’s own dimensions.


35. The Invariance of the Dirac Equation with regard to Co¬ 
ordinate Transformations 

We have hitherto considered the Dirac equation of motion for a parti¬ 
cular frame of reference specified by the coordinates x , y , z and the time t. 
We shall now investigate the transformation properties of this equation 
for such transformations as correspond to a rotation of the coordinate 
system x, y , z in space, or more generally to a Lorentz transformation of 
the coordinates and the time (i.e. to a rotation of the original frame 
in a four-dimensional space-time manifold). 

We shall first write down the Dirac equation in the form of two
two-dimensional matrix equations

$$\begin{aligned} \boldsymbol{\sigma}\cdot\mathbf{u}\,\psi + (u_t - m_0c)\chi &= 0, \\ \boldsymbol{\sigma}\cdot\mathbf{u}\,\chi + (u_t + m_0c)\psi &= 0 \end{aligned} \tag{302}$$

[cf. (257 a), § 30] and limit ourselves to rotations in ordinary space,
which do not affect the operator $u_t$. The invariance of equations (302)
with regard to such rotations can be achieved in two different ways:

(1) By considering the wave functions (matrices) $\psi = \begin{vmatrix}\psi_1\\ \psi_2\end{vmatrix}$ and
$\chi = \begin{vmatrix}\chi_1\\ \chi_2\end{vmatrix}$ as invariant and the matrices $\sigma_x$, $\sigma_y$, $\sigma_z$ as covariant, i.e.
transforming according to the same law as the coordinates $x$, $y$, $z$. Under
this condition the product $\boldsymbol{\sigma}\cdot\mathbf{u} = \sigma_x u_x + \sigma_y u_y + \sigma_z u_z$ will define a scalar
(invariant) operator.

(2) By considering the matrices $\sigma_x$, $\sigma_y$, $\sigma_z$ as invariant numerical
operators, and introducing a suitable transformation for the matrices
$\psi$, $\chi$.

† It might also be surmised that the proton is a complicated particle formed by the
combination of a neutron with a positive electron.

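The two methods must, of course, give equivalent results. In the
first case we can define the matrix $\sigma_n$ for any direction $n$ (which may
be that of one of the new coordinate axes) as the projection of the
vector $\boldsymbol{\sigma}$ in this direction. Using the polar angles $\theta_n$, $\phi_n$ to specify it
with respect to the original coordinate system $C(x,y,z)$, we have

$$\sigma_n = \sigma_x\cos(x,n) + \sigma_y\cos(y,n) + \sigma_z\cos(z,n) = \sin\theta_n(\sigma_x\cos\phi_n + \sigma_y\sin\phi_n) + \sigma_z\cos\theta_n,$$

which is equivalent to four equations for the matrix elements $\sigma_{n\alpha\beta}$
($\alpha, \beta = 1, 2$) of $\sigma_n$. With the help of the expressions

$$\sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \sigma_y = \begin{pmatrix} 0 & i \\ -i & 0 \end{pmatrix}, \qquad \sigma_z = \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix},$$

defining the rectangular components of $\boldsymbol{\sigma}$ in the system $C$, we get

$$\sigma_n = \begin{pmatrix} -\cos\theta_n & \sin\theta_n\,e^{i\phi_n} \\ \sin\theta_n\,e^{-i\phi_n} & \cos\theta_n \end{pmatrix}. \tag{302 a}$$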


This equation can be applied for the definition of the matrices $\sigma_{x'}$, $\sigma_{y'}$,
$\sigma_{z'}$ which represent the rectangular components of the vector $\boldsymbol{\sigma}$ with
regard to a new coordinate system $C'(x',y',z')$.

We shall not, however, write down the explicit expressions for these
matrices (which can easily be found with the help of the three Eulerian
angles), but shall limit ourselves to presenting the general
transformation equation in the form

$$\sigma'_{n\alpha\beta} = \sum_{m=1}^{3} a_{mn}\,\sigma_{m\alpha\beta}, \tag{302 b}$$

where the indices $m, n = 1, 2, 3$ stand for the three axes of the old
and the new system respectively ($\sigma'_1 = \sigma_{x'}$, etc.), while $a_{mn}$ is the matrix
of the orthogonal transformation $C \to C'$:

$$x'_n = \sum_{m=1}^{3} a_{mn}\,x_m.$$

It should be emphasized that the indices $m$, $n$ which specify the
coordinate axes or the rectangular components of $\boldsymbol{\sigma}$, have nothing to do
with the indices $\alpha$, $\beta$ which specify the matrix elements of $\boldsymbol{\sigma}$ or of its
rectangular components.

The transition from the first method (of transforming $\sigma_m$) to the
second method (of transforming $\psi$ and $\chi$) can be carried out in the
following way:

We try to find a unitary two-dimensional matrix $A$ such that the
transformation defined by (302 b) shall be equivalent to the following
one:

$$\sigma'_n = A^{-1}\sigma_n A \qquad (A^{-1} = A^{\dagger}), \tag{303}$$

that is,

$$\sigma'_{n\alpha\beta} = \sum_{\gamma=1}^{2}\sum_{\delta=1}^{2} A^*_{\gamma\alpha}\,A_{\delta\beta}\,\sigma_{n\gamma\delta},$$

involving a component of $\boldsymbol{\sigma}$ along a given new axis and along the
corresponding axis only of the original coordinate system.

The relation between the transformation (302 b) and (303) can be
stated as follows: in the former the matrices $\sigma_m$ (or $\sigma'_n$) appear as
components of a vector in ordinary three-dimensional space, whereas in the
second case they appear as tensors in the two-dimensional spin-space
specified by the Greek indices $\alpha$, $\beta$, etc. The transformation matrices
$a_{mn}$ and $A_{\alpha\beta}$ are both unitary and refer respectively to the ordinary
space and to the state-space.

Let us suppose that we have succeeded in finding $A$ and let us write
the scalar product $\boldsymbol{\sigma}\cdot\mathbf{u}$ in the form

$$\sum_m \sigma_m u_m = \sum_n \sigma'_n u'_n = \sum_n A^{-1}\sigma_n A\,u'_n = A^{-1}\Bigl(\sum_n \sigma_n u'_n\Bigr)A.$$

($A$ commutes with $u'_n$ since the latter is a scalar in the state-space.)
The transformed equations (302) can be written accordingly in the form

$$A^{-1}\Bigl(\sum_n \sigma_n u'_n\Bigr)A\,\psi + (u_t - m_0c)\chi = 0,$$

$$A^{-1}\Bigl(\sum_n \sigma_n u'_n\Bigr)A\,\chi + (u_t + m_0c)\psi = 0.$$

Multiplying them on the left by the matrix $A$, we get

$$\begin{aligned} \Bigl(\sum_n \sigma_n u'_n\Bigr)\psi' + (u_t - m_0c)\chi' &= 0, \\ \Bigl(\sum_n \sigma_n u'_n\Bigr)\chi' + (u_t + m_0c)\psi' &= 0, \end{aligned} \tag{303 a}$$

with the operator-matrix $\boldsymbol{\sigma}\cdot\mathbf{u}'$ of the same form as in the original
coordinate system and with the transformed wave functions

$$\psi' = A\psi, \qquad \chi' = A\chi. \tag{303 b}$$

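We shall determine the transformation matrix $A$ for the simple case
of a rotation in the $(x, y)$-plane through a given angle $\phi$ (in the
direction from $x$ to $y$). This gives

$$x' = x\cos\phi + y\sin\phi, \qquad y' = -x\sin\phi + y\cos\phi, \qquad z' = z,$$

and consequently

$$\sigma_{x'} = \sigma_x\cos\phi + \sigma_y\sin\phi, \qquad \sigma_{y'} = -\sigma_x\sin\phi + \sigma_y\cos\phi, \qquad \sigma_{z'} = \sigma_z,$$

that is,

$$\sigma_{x'} = \begin{pmatrix} 0 & e^{i\phi} \\ e^{-i\phi} & 0 \end{pmatrix}, \qquad \sigma_{y'} = \begin{pmatrix} 0 & ie^{i\phi} \\ -ie^{-i\phi} & 0 \end{pmatrix}, \qquad \sigma_{z'} = \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix}.$$

Now we must have, irrespective of the index $n$,

$$\sigma_n A = A\sigma'_n,$$

and in particular for $n = 3$, $\sigma_z A = A\sigma_z$, that is, since $\sigma_z$ is a diagonal
matrix, $-A_{12} = A_{12}$ and $A_{21} = -A_{21}$,
whence it follows that $A$ must also be a diagonal matrix. Putting
$A = \begin{pmatrix} A_1 & 0 \\ 0 & A_2 \end{pmatrix}$, we get further from $\sigma_x A = A\sigma_{x'}$

$$\begin{pmatrix} 0 & A_2 \\ A_1 & 0 \end{pmatrix} = \begin{pmatrix} 0 & A_1 e^{i\phi} \\ A_2 e^{-i\phi} & 0 \end{pmatrix},$$

that is, $A_2 = A_1 e^{i\phi}$, $A_1 = A_2 e^{-i\phi}$,
or consequently $A_1 = ce^{-\frac12 i\phi}$, $A_2 = ce^{+\frac12 i\phi}$. The same result is obtained
from the equation $\sigma_y A = A\sigma_{y'}$. The constant $c$ is determined by the
condition that the determinant of $A$ (a unitary matrix) is equal to 1.
We thus get $c = 1$ and finally

$$A = \begin{pmatrix} e^{-\frac12 i\phi} & 0 \\ 0 & e^{+\frac12 i\phi} \end{pmatrix} = \cos\tfrac12\phi + i\sigma_z\sin\tfrac12\phi \tag{304 a}$$

(the first term being understood to be multiplied by the unit matrix $\delta$)
which corresponds to the following transformed expressions for the
functions $\psi$, $\chi$:

$$\psi'_1 = \psi_1 e^{-\frac12 i\phi}, \quad \psi'_2 = \psi_2 e^{+\frac12 i\phi}; \qquad \chi'_1 = \chi_1 e^{-\frac12 i\phi}, \quad \chi'_2 = \chi_2 e^{+\frac12 i\phi}. \tag{304 b}$$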

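For a rotation in the plane $x$, $z$ through the angle $\theta$ (in the direction
from $z$ to $x$), i.e. for the transformation

$$\sigma_{x'} = \sigma_x\cos\theta - \sigma_z\sin\theta, \qquad \sigma_{y'} = \sigma_y, \qquad \sigma_{z'} = \sigma_x\sin\theta + \sigma_z\cos\theta, \tag{305}$$

$$\sigma_{x'} = \begin{pmatrix} \sin\theta & \cos\theta \\ \cos\theta & -\sin\theta \end{pmatrix}, \qquad \sigma_{y'} = \begin{pmatrix} 0 & i \\ -i & 0 \end{pmatrix}, \qquad \sigma_{z'} = \begin{pmatrix} -\cos\theta & \sin\theta \\ \sin\theta & \cos\theta \end{pmatrix},$$

we get in a similar way

$$A_{11} = A_{22}, \qquad A_{12} = -A_{21}$$

(from the equation $\sigma_y A = A\sigma_y$), and further, from $\sigma_x A = A\sigma_{x'}$ or
$\sigma_z A = A\sigma_{z'}$ together with the condition $|A| = 1$:

$$A = \begin{pmatrix} \cos\tfrac12\theta & -\sin\tfrac12\theta \\ \sin\tfrac12\theta & \cos\tfrac12\theta \end{pmatrix} = \cos\tfrac12\theta + i\sigma_y\sin\tfrac12\theta, \tag{305 a}$$

whence

$$\psi'_1 = \psi_1\cos\tfrac12\theta - \psi_2\sin\tfrac12\theta, \qquad \psi'_2 = \psi_1\sin\tfrac12\theta + \psi_2\cos\tfrac12\theta,$$

$$\chi'_1 = \chi_1\cos\tfrac12\theta - \chi_2\sin\tfrac12\theta, \qquad \chi'_2 = \chi_1\sin\tfrac12\theta + \chi_2\cos\tfrac12\theta.$$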

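It should be mentioned that the transformation matrices (304 a) and
(305 a) can be written in the form $e^{\frac12 i\phi\sigma_z}$ and $e^{\frac12 i\theta\sigma_y}$ respectively. We
have in fact, by the definition of the exponential function

$$e^{i\mu\sigma_n} = \sum_{k=0}^{\infty}\frac{(i\mu\sigma_n)^k}{k!} = \cos\mu + i\sigma_n\sin\mu,$$

since $\sigma_n^0 = \sigma_n^2 = \dots = \delta\ (= 1)$, $\sigma_n^1 = \sigma_n^3 = \dots = \sigma_n$.

With $\mu = \tfrac12\phi$ and $\sigma_n = \sigma_z$ this gives (304 a); with $\mu = \tfrac12\theta$ and $\sigma_n = \sigma_y$, it
gives (305 a).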
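
The exponential form can be verified numerically by summing the power series directly. The sketch below is an illustration only: it assumes the representation $\sigma_x = (0, 1; 1, 0)$, $\sigma_y = (0, i; -i, 0)$, $\sigma_z = (-1, 0; 0, 1)$ implied by (302 a), and the helper names (`exp_series`, `closed_form`, and so on) are hypothetical.

```python
import cmath
import math

I2 = [[1, 0], [0, 1]]
SY = [[0, 1j], [-1j, 0]]   # sigma_y in the representation assumed above
SZ = [[-1, 0], [0, 1]]     # sigma_z in the representation assumed above

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def exp_series(m, terms=40):
    """Truncated power series 1 + m + m^2/2! + ... for a 2x2 matrix m."""
    result, term = I2, I2
    for n in range(1, terms):
        term = scale(1.0 / n, matmul(term, m))
        result = add(result, term)
    return result

def closed_form(sigma_n, mu):
    """cos(mu) + i sigma_n sin(mu), valid whenever sigma_n squared is unity."""
    return add(scale(math.cos(mu), I2), scale(1j * math.sin(mu), sigma_n))

def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

phi, theta = 0.7, 1.3
print(close(exp_series(scale(0.5j * phi, SZ)), closed_form(SZ, phi / 2)))
print(close(exp_series(scale(0.5j * theta, SY)), closed_form(SY, theta / 2)))
```

Both comparisons should print `True`; the closed forms are the diagonal matrix of (304 a) and the real rotation matrix of (305 a).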

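Two successive rotations are obviously equivalent to a single one,
specified by a matrix ($a''$ or $A''$) which is equal to the product of the
matrices ($a$, $a'$ or $A$, $A'$) specifying the two component rotations. Thus,
for example, by combining the two preceding rotations in the order
stated, we get a rotation with the transformation matrix (in the state-space):

$$A'' = \begin{pmatrix} \cos\tfrac12\theta & -\sin\tfrac12\theta \\ \sin\tfrac12\theta & \cos\tfrac12\theta \end{pmatrix}\begin{pmatrix} e^{-\frac12 i\phi} & 0 \\ 0 & e^{\frac12 i\phi} \end{pmatrix} = \begin{pmatrix} \cos\tfrac12\theta\,e^{-\frac12 i\phi} & -\sin\tfrac12\theta\,e^{\frac12 i\phi} \\ \sin\tfrac12\theta\,e^{-\frac12 i\phi} & \cos\tfrac12\theta\,e^{\frac12 i\phi} \end{pmatrix},$$

which can be written symbolically in the form

$$e^{\frac12 i\theta\sigma_y}\,e^{\frac12 i\phi\sigma_z} = e^{\frac12 i(\phi\sigma_z + \theta\sigma_y)}$$

with the understanding that the order of the two factors should not
be inverted.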

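This means that to a coordinate transformation defined by the equations

$$x'' = (x\cos\phi + y\sin\phi)\cos\theta - z\sin\theta,$$
$$y'' = -x\sin\phi + y\cos\phi,$$
$$z'' = (x\cos\phi + y\sin\phi)\sin\theta + z\cos\theta$$

there corresponds the following transformation of the functions $\psi$:

$$\psi''_1 = \psi_1\cos\tfrac12\theta\,e^{-\frac12 i\phi} - \psi_2\sin\tfrac12\theta\,e^{\frac12 i\phi},\qquad \psi''_2 = \psi_1\sin\tfrac12\theta\,e^{-\frac12 i\phi} + \psi_2\cos\tfrac12\theta\,e^{\frac12 i\phi},$$

and a similar transformation of $\chi_1$, $\chi_2$.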

The preceding results are easily generalized for any number of successive
rotations about arbitrarily chosen axes. These rotations are
always equivalent to a single rotation through an angle $\omega$ about an axis
specified by a unit vector $\mathbf n$. The transformation matrix $A$ corresponding
to such a rotation is easily seen to be

$$A = \cos\tfrac12\omega + i\sigma_n\sin\tfrac12\omega = e^{\frac12 i\omega\sigma_n},$$

where $\sigma_n = \boldsymbol\sigma\cdot\mathbf n = n_x\sigma_x + n_y\sigma_y + n_z\sigma_z$ is the component of $\boldsymbol\sigma$ along the
axis of rotation. The reciprocal matrix

$$A^{-1} = \cos\tfrac12\omega - i\sigma_n\sin\tfrac12\omega$$

corresponds to a rotation about the same axis in the opposite direction
(or to a rotation about the oppositely directed axis $-\mathbf n$ through the
same angle); it obviously coincides with $A^\dagger$ since $\boldsymbol\sigma^\dagger = \boldsymbol\sigma$. Hence it
follows that $A$ is a unitary matrix, as was assumed at the beginning.

A two-dimensional unitary matrix of unit determinant can be represented with the help
of two complex numbers $\alpha$, $\beta$ satisfying the condition $\alpha\alpha^* + \beta\beta^* = 1$ in
the form

$$A = \begin{pmatrix} \alpha^* & \beta \\ -\beta^* & \alpha \end{pmatrix}.$$

In the present case these numbers are

$$\alpha = \cos\tfrac12\omega + i n_z\sin\tfrac12\omega,\qquad \beta = i(n_x + i n_y)\sin\tfrac12\omega.$$

It should be mentioned that the number of real independent parameters
which determine the rotation is equal to three (the rotation angle $\omega$ and
the two angles $\theta$, $\phi$ which determine the direction of the axis of rotation
$\mathbf n$, or three of the four real numbers which define $\alpha$ and $\beta$ under the
condition $\alpha\alpha^* + \beta\beta^* = 1$).
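As a hedged numerical illustration of the general rotation matrix $A = \cos\tfrac12\omega + i\sigma_n\sin\tfrac12\omega$, the sketch below builds $A$ for an arbitrary unit vector and checks that it is unitary with unit determinant. It assumes the matrix convention $\sigma_y = \begin{pmatrix}0 & i\\ -i & 0\end{pmatrix}$, $\sigma_z = \begin{pmatrix}-1 & 0\\ 0 & 1\end{pmatrix}$ read off from (305); under that assumption the numbers $\alpha = \cos\tfrac12\omega + i n_z\sin\tfrac12\omega$ and $\beta = i(n_x + i n_y)\sin\tfrac12\omega$ appear as the lower-right and upper-right elements. All function names are illustrative.

```python
import math

# Assumed sign convention (taken from the matrices displayed in (305)).
SIGMA_X = [[0, 1], [1, 0]]
SIGMA_Y = [[0, 1j], [-1j, 0]]
SIGMA_Z = [[-1, 0], [0, 1]]


def rotation_matrix(omega, n):
    """A = cos(omega/2) + i*sigma_n*sin(omega/2) for a unit vector n = (nx, ny, nz)."""
    nx, ny, nz = n
    c, s = math.cos(omega / 2), math.sin(omega / 2)
    return [[c * (i == j)
             + 1j * s * (nx * SIGMA_X[i][j] + ny * SIGMA_Y[i][j] + nz * SIGMA_Z[i][j])
             for j in range(2)] for i in range(2)]


def dagger(a):
    """Conjugate transpose of a 2x2 matrix."""
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def det(a):
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]


n = (2 / 7, 3 / 7, 6 / 7)   # a unit vector, since 4 + 9 + 36 = 49
omega = 1.1
A = rotation_matrix(omega, n)
alpha = complex(math.cos(omega / 2), n[2] * math.sin(omega / 2))
beta = 1j * (n[0] + 1j * n[1]) * math.sin(omega / 2)

print(matmul(A, dagger(A)))                 # close to the unit matrix
print(det(A))                               # close to 1
print(abs(alpha) ** 2 + abs(beta) ** 2)     # close to 1
```

Only three real parameters enter (`omega` and the two angles fixing `n`), in line with the count given in the text.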

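As has been shown in § 30, the probability density and the rectangular
components of the probability current density are expressed, with the
help of the two-dimensional matrices $\delta$, $\boldsymbol\sigma$, by the equations

$$\rho = \psi^\dagger\psi + \chi^\dagger\chi,\qquad j_n = c\,(\psi^\dagger\sigma_n\chi + \chi^\dagger\sigma_n\psi)$$

[$n = 1, 2, 3$; cf. eqs. (259) and (259 a)]. Transforming the functions $\psi$ and
$\chi$ according to the equations $\psi' = A\psi$, $\psi'^\dagger = \psi^\dagger A^\dagger$, and regarding the
matrices $\sigma_n$ as invariant, we obtain for the same quantities referring to
the rotated system the expressions

$$\rho' = \psi^\dagger A^\dagger A\psi + \chi^\dagger A^\dagger A\chi = \psi^\dagger\psi + \chi^\dagger\chi = \rho$$

(since $A^\dagger = A^{-1}$), and

$$j'_n = c\,[\psi^\dagger(A^\dagger\sigma_n A)\chi + \chi^\dagger(A^\dagger\sigma_n A)\psi] = c\,(\psi^\dagger\sigma'_n\chi + \chi^\dagger\sigma'_n\psi) = \sum_m a_{mn}j_m,$$

in agreement with the invariant character of $\rho$ and the covariant
character of the components of the vector $\mathbf j$.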
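The two statements above (invariance of the density, vector character of the bilinear expressions) rest on the relation $A^\dagger\sigma_n A = \sigma'_n$. The sketch below checks this for the rotation (305) in the plane $x, z$ and confirms that the density is unchanged. It is an illustration only, assuming the matrix convention shown in (305); the spinor values and helper names are arbitrary choices made here.

```python
import math

# Assumed sign convention (taken from the matrices displayed in (305)).
SIGMA_X = [[0, 1], [1, 0]]
SIGMA_Y = [[0, 1j], [-1j, 0]]
SIGMA_Z = [[-1, 0], [0, 1]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def dagger(a):
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]


def rotation_xz(theta):
    """The matrix A of (305 a): half-angle rotation matrix."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c, -s], [s, c]]


def transformed(sigma, a):
    """A^dagger sigma A, equal to A^{-1} sigma A because A is unitary."""
    return matmul(dagger(a), matmul(sigma, a))


def combination(cx, cz):
    """cx*sigma_x + cz*sigma_z."""
    return [[cx * SIGMA_X[i][j] + cz * SIGMA_Z[i][j] for j in range(2)]
            for i in range(2)]


def apply(a, spinor):
    return [a[0][0] * spinor[0] + a[0][1] * spinor[1],
            a[1][0] * spinor[0] + a[1][1] * spinor[1]]


def density(psi, chi):
    """psi^dagger psi + chi^dagger chi."""
    return sum(abs(z) ** 2 for z in list(psi) + list(chi))


def max_abs_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))


theta = 0.9
A = rotation_xz(theta)
psi, chi = [0.3 + 0.1j, -0.5j], [0.2, 0.7 - 0.4j]   # arbitrary sample values

print(max_abs_diff(transformed(SIGMA_X, A), combination(math.cos(theta), -math.sin(theta))))
print(max_abs_diff(transformed(SIGMA_Z, A), combination(math.sin(theta), math.cos(theta))))
print(density(apply(A, psi), apply(A, chi)) - density(psi, chi))
```

All three printed numbers should vanish to rounding accuracy.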

The preceding results are easily extended to the four-dimensional
matrix form of the Dirac equation and of the associated operators.
Taking, for example, the energy operator

$$\epsilon = U + \epsilon_0\gamma_0 + c\sum_{n=1}^{3}\gamma_n u_n,$$

we can consider it as an invariant with regard to rotations in ordinary
space if the three four-dimensional matrices $\gamma_1 = \gamma_x$, $\gamma_2 = \gamma_y$, $\gamma_3 = \gamma_z$
are defined as covariant operators, satisfying the same transformation
equations as the coordinates $x_1 = x$, $x_2 = y$, $x_3 = z$ or the components
of the operator $\mathbf u$. The shape of the transformed matrices $\gamma'_n$ is easily
obtained from the above expressions for the transformed matrices $\sigma'_n$
with the help of the invariant relations $\gamma_n = \begin{pmatrix} 0 & \sigma_n \\ \sigma_n & 0 \end{pmatrix}$.

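The same relations can serve for the determination of the unitary
matrices, $L$ say, which determine the equivalent transformation in the
four-dimensional spin-space according to the ‘tensor’ law

$$\gamma'_n = L^{-1}\gamma_n L = L^\dagger\gamma_n L \qquad (n = 1, 2, 3).$$

We have, namely,

$$L = \begin{pmatrix} A & 0 \\ 0 & A \end{pmatrix}, \tag{306 a}$$

where $A = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix}$ is the two-dimensional unitary matrix defining
the transformation of $\sigma_n$ $\Bigl(0 = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}\Bigr)$.

With the help of the matrix $\boldsymbol\xi = \begin{pmatrix} \boldsymbol\sigma & 0 \\ 0 & \boldsymbol\sigma \end{pmatrix}$, which serves to describe
the electron’s spin or magnetic moment [cf. (277)], we can write the
matrix $L$ corresponding to a given rotation $(\omega, \mathbf n)$ explicitly in the form

$$L = \cos\tfrac12\omega + i\xi_n\sin\tfrac12\omega = e^{\frac12 i\omega\xi_n}, \tag{306 b}$$

similar to (306) with $\sigma$ replaced by $\xi$.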

The matrix $\gamma_0$ remains invariant under this transformation. Writing
Dirac’s equation in the form $(\epsilon + p_t)\psi = 0$ and using equation (306) for
the $\gamma'_n$, we can write it for the rotated coordinate system in the form

$$\Bigl[(p_t + U + \epsilon_0\gamma_0) + c\sum_n L^{-1}\gamma_n L\,u'_n\Bigr]\psi = 0,$$

or, since $(p_t + U + \epsilon_0\gamma_0) = L^{-1}(p_t + U + \epsilon_0\gamma_0)L$,

$$L^{-1}\Bigl[p_t + U + \epsilon_0\gamma_0 + c\sum_n\gamma_n u'_n\Bigr]L\psi = 0.$$

If this equation is multiplied on the left by $L$, it reduces to the original
form, with the old matrices $\gamma_n$, the new components of $\mathbf u$, and a new
wave function $\psi'$ derived from the old one by means of the transformation
$\psi' = L\psi$.

Putting $\psi = \begin{pmatrix} \phi \\ \chi \end{pmatrix}$, where $\phi = \begin{pmatrix} \psi_1 \\ \psi_2 \end{pmatrix}$ and $\chi = \begin{pmatrix} \chi_1 \\ \chi_2 \end{pmatrix}$, we get, with the help
of (306),

$$\phi' = A\phi,\qquad \chi' = A\chi,$$

in agreement with the results obtained before.

It can further be shown directly that under the transformation $\psi' = L\psi$
and $\psi'^\dagger = \psi^\dagger L^\dagger$ the product $\psi^\dagger\psi$ remains invariant while the quantities
$\psi^\dagger\gamma_n\psi$ transform as the rectangular components of a vector.

We can now turn to the generalization of the preceding results for
rotations in the four-dimensional space-time manifold of the relativity
theory, i.e. for Lorentz transformations, corresponding to a transition
from a state of ‘rest’ to that of uniform motion.

It will be convenient in this connexion to use Dirac’s equation in
the form $B\psi = 0$, i.e.

$$(\beta_x u_x + \beta_y u_y + \beta_z u_z + \beta_t u_t + m_0 c)\psi = 0,$$

or

$$\Bigl(\sum_{n=1}^{4}\beta_n u_n + m_0 c\Bigr)\psi = 0, \tag{307}$$

where $n = 1, 2, 3$ stands for $x$, $y$, $z$ respectively, while

$$x_4 = \sqrt{-1}\,ct,\qquad u_4 = -\sqrt{-1}\,u_t,\qquad \beta_4 = \sqrt{-1}\,\beta_t.$$

It must be emphasized that the imaginary unit $\sqrt{-1}$ is introduced here
simply for the sake of formal symmetry, and that it will be treated in
the sequel as an ordinary ‘real’ number, in the sense that its sign will
not be altered in a transition to conjugate complex quantities. In order
to distinguish this relativistic $\sqrt{-1}$ from that of the quantum theory,
which plays an essential role, we shall denote the relativistic $\sqrt{-1}$ by
the Greek letter $\iota$ ($\iota^* = \iota$, $i^* = -i$).

A Lorentz transformation is defined as a linear transformation of
the form

$$x'_n = \sum_{m=1}^{4} a_{mn}x_m,$$

satisfying the orthogonality condition $\sum_{n=1}^{4} x'^2_n = \sum_{n=1}^{4} x_n^2$ and the condition
that the first three components of $x'$ should be real and the fourth
imaginary (reality condition). The components of the four-dimensional
operator $u$ are transformed in the same way as the corresponding
coordinates, and if we wish to ensure the invariance of equation (307),
we must either submit the matrices $\beta_n$ to the same Lorentz transformation

$$\beta'_n = \sum_{m=1}^{4} a_{mn}\beta_m,\qquad \beta_m = \sum_{n=1}^{4} a_{mn}\beta'_n, \tag{307 a}$$

or introduce the equivalent tensor transformation in the four-dimensional
state-space

$$\beta'_n = K^{-1}\beta_n K. \tag{307 b}$$

With the help of the latter the transformed Dirac equation can be put
in the form

$$K^{-1}\Bigl(\sum_{n=1}^{4}\beta_n u'_n + m_0 c\Bigr)K\psi = 0,$$

that is,

$$\Bigl(\sum_{n=1}^{4}\beta_n u'_n + m_0 c\Bigr)\psi' = 0,$$

with the same numerical matrices $\beta_n$ as the original ones and with the
transformed wave function

$$\psi' = K\psi. \tag{307 c}$$

The possibility of replacing (307 a) by (307 b) is proved by the fact
that the transformation matrices $a_{mn}$ (in the ordinary space-time) and
$K$ (in the four-dimensional state-space) have the same rank. They
contain therefore the same number of elements.

The determination of $K$ through $a$ can be carried out in the same
way as in the case of rotations in ordinary space, by combining rotations
in different planes.

In the case of rotations in ordinary space the matrix $K$ must obviously
coincide with the matrix $L$ considered before. This follows from
the relations $\beta_n = \gamma_0\gamma_n$ for $n = 1, 2, 3$ ($\beta_4 = \iota\gamma_0$) in conjunction with
the fact that $\gamma_0$ is not affected by a spatial rotation. Now for a rotation
through an angle $\omega$ in the plane $(x_1, x_2)$ we have, as has been shown above,
$L = e^{\frac12 i\omega\xi_3}$ or, since $\xi_3 = i\beta_1\beta_2$ [according to (276), § 31], $L = e^{-\frac12\omega\beta_1\beta_2}$.
Identifying this with the matrix $K$ for the case under consideration
and taking into account the relativistic symmetry of Dirac’s equation
in the form (307) with respect to the space coordinates and the time
($\iota ct$), we can define the matrix $K$ corresponding to a transition from
a state of rest to that of a motion in the direction of the first axis with
a velocity $v$ by the expression

$$K = e^{-\frac12\vartheta\beta_1\beta_4},$$

corresponding to a rotation in the plane $(x_1, x_4)$ through the imaginary
angle $\vartheta = \tan^{-1}(v/\iota c)$. Replacing here $\beta_1$ by $\gamma_0\gamma_1$, $\beta_4$ by $\iota\gamma_0$, and putting
$\vartheta = \iota\theta$, where

$$\tanh\theta = \frac{v}{c}\qquad\Bigl(\cosh\theta = \frac{1}{\sqrt{1 - v^2/c^2}},\quad \sinh\theta = \frac{v/c}{\sqrt{1 - v^2/c^2}}\Bigr),$$

we get, since $\gamma_0\gamma_1\gamma_0 = -\gamma_0^2\gamma_1 = -\gamma_1$,

$$K = e^{-\frac12\theta\gamma_1}.$$

This result is easily generalized for the case of motion in any direction
specified by the unit vector $\mathbf n'$. Denoting the corresponding component
of $\boldsymbol\gamma$ (i.e. the scalar product $\boldsymbol\gamma\cdot\mathbf n'$) by $\gamma_{n'}$, we get

$$K = e^{-\frac12\theta\gamma_{n'}} = \cosh\tfrac12\theta - \gamma_{n'}\sinh\tfrac12\theta. \tag{308}$$

In order to find the corresponding expression for the matrix $L$ we
must come back to that form of Dirac’s equation which has been used
hitherto, viz. $\bigl(\sum_{n=1}^{4}\gamma_n u_n + \gamma_0 m_0 c\bigr)\psi = 0$ with $\gamma_4 = \iota\delta$ and $u_4 = -\iota u_t$, where
the factor $\iota$ is introduced in order to secure a more complete symmetry
between the terms involving the space coordinates and the time. The
Lorentz transformation of the components of the operator $u$, defined
by the equations $u'_n = \sum_{m=1}^{4} a_{mn}u_m$, must be combined with an appropriate
transformation of the wave function, $\psi' = L\psi$, so that the transformed
equation shall reduce to the form $\bigl(\sum_{n=1}^{4}\gamma_n u'_n + \gamma_0 m_0 c\bigr)\psi' = 0$ with the same
matrices $\gamma_n$ (including $\gamma_0$) as the original one. Replacing $\psi$ and $\psi'$ by
$\gamma_0\psi = \bar\psi$ and $\gamma_0\psi' = \bar\psi'$ respectively, we come back to the equations

$$\Bigl(\sum_{n=1}^{4}\beta_n u_n + m_0 c\Bigr)\bar\psi = 0\quad\text{and}\quad\Bigl(\sum_{n=1}^{4}\beta_n u'_n + m_0 c\Bigr)\bar\psi' = 0;$$

whence it follows that $L = \gamma_0^{-1}K\gamma_0$, where $K$ is the transformation
considered before. Since $\gamma_0^2 = 1$, i.e. $\gamma_0 = \gamma_0^{-1}$, we can put $L = \gamma_0 K\gamma_0$.
Substituting here the expression (308) for $K$, we get

$$L = \gamma_0^2\cosh\tfrac12\theta - \gamma_0\gamma_{n'}\gamma_0\sinh\tfrac12\theta$$

or, since $\gamma_0^2 = 1$ and $\gamma_0\gamma_{n'}\gamma_0 = -\gamma_0^2\gamma_{n'} = -\gamma_{n'}$,

$$L = \cosh\tfrac12\theta + \gamma_{n'}\sinh\tfrac12\theta = e^{\frac12\theta\gamma_{n'}}. \tag{308 a}$$

If $\boldsymbol\gamma$ is replaced here by $i\boldsymbol\eta$, where $\boldsymbol\eta$ is the matrix which serves to define
the electron’s electric moment in the same way as $\boldsymbol\xi$ defines the magnetic
one [cf. (276 a)], $L$ assumes a form quite similar to that (306 a) which
corresponds to an ordinary spatial rotation. It should be remembered,
however, that while $\boldsymbol\xi$ represents a real quantity, $\boldsymbol\eta$ must be considered
as a pure imaginary. This corresponds to an important distinction
between the matrices (306 b) and (308 a), the former being unitary
($L^\dagger = L^{-1}$ defining a rotation in the opposite sense) and the latter
Hermitian ($L^\dagger = L$).
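The relation $L = \gamma_0 K\gamma_0$ and the Hermitian (non-unitary) character of the boost matrix can be illustrated with explicit four-row matrices. The sketch below is hedged: the block forms $\gamma_0 = \begin{pmatrix}\delta & 0\\ 0 & -\delta\end{pmatrix}$ and $\gamma_1 = \begin{pmatrix}0 & \sigma_x\\ \sigma_x & 0\end{pmatrix}$ are assumptions chosen to satisfy $\gamma_0^2 = \gamma_1^2 = 1$ and $\gamma_0\gamma_1\gamma_0 = -\gamma_1$, the only properties the argument uses; the function names are illustrative.

```python
import math


def block(a, b, c, d):
    """Assemble a 4x4 matrix from 2x2 blocks arranged as [[a, b], [c, d]]."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def combine(ca, a, cb, b):
    """Linear combination ca*a + cb*b of two square matrices."""
    n = len(a)
    return [[ca * a[i][j] + cb * b[i][j] for j in range(n)] for i in range(n)]


def transpose(a):
    n = len(a)
    return [[a[j][i] for j in range(n)] for i in range(n)]


def max_abs_diff(a, b):
    n = len(a)
    return max(abs(a[i][j] - b[i][j]) for i in range(n) for j in range(n))


ZERO2 = [[0, 0], [0, 0]]
UNIT2 = [[1, 0], [0, 1]]
MINUS2 = [[-1, 0], [0, -1]]
SIGMA_X = [[0, 1], [1, 0]]

UNIT4 = block(UNIT2, ZERO2, ZERO2, UNIT2)
GAMMA_0 = block(UNIT2, ZERO2, ZERO2, MINUS2)    # assumed block form
GAMMA_1 = block(ZERO2, SIGMA_X, SIGMA_X, ZERO2)  # assumed block form


def boost_K(theta):
    """K = cosh(theta/2) - gamma_1 sinh(theta/2), cf. (308)."""
    return combine(math.cosh(theta / 2), UNIT4, -math.sinh(theta / 2), GAMMA_1)


def boost_L(theta):
    """L = cosh(theta/2) + gamma_1 sinh(theta/2), cf. (308 a)."""
    return combine(math.cosh(theta / 2), UNIT4, math.sinh(theta / 2), GAMMA_1)


theta = math.atanh(0.6)          # v/c = 0.6
K, L = boost_K(theta), boost_L(theta)
print(max_abs_diff(matmul(GAMMA_0, matmul(K, GAMMA_0)), L))   # L = gamma_0 K gamma_0
print(max_abs_diff(transpose(L), L))                          # L is real and symmetric
print(max_abs_diff(matmul(transpose(L), L), UNIT4))           # not zero: L is not unitary
```

The first two numbers vanish to rounding accuracy, while the third does not, which is the distinction between (306 b) and (308 a) drawn in the text.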

In the general case of a Lorentz transformation combining an
ordinary rotation $(\omega, \mathbf n)$ with a relative motion $(\theta, \mathbf n')$, the matrix $L$ can
be represented as the product of the two component transformations
taken in a definite order, for instance,

$$L = e^{\frac12 i\omega\xi_n}\,e^{\frac12\theta\gamma_{n'}}. \tag{308 b}$$

The adjoint matrix is

$$L^\dagger = e^{\frac12\theta\gamma_{n'}}\,e^{-\frac12 i\omega\xi_n},$$

so that

$$L^\dagger L = \cosh^2\tfrac12\theta + \sinh^2\tfrac12\theta + 2\gamma_{n'}\sinh\tfrac12\theta\cosh\tfrac12\theta = \cosh\theta + \gamma_{n'}\sinh\theta.$$

Substituting this expression in the formula $\rho' = \psi^\dagger L^\dagger L\psi$ for the transformed
value of the probability density (in the ‘moving’ coordinate
system), we get

$$\rho' = \psi^\dagger\psi\cosh\theta + \psi^\dagger\gamma_{n'}\psi\sinh\theta,$$

that is,

$$\rho' = \rho\cosh\theta + \frac{j_{n'}}{c}\sinh\theta = \frac{\rho + j_{n'}v/c^2}{\sqrt{1 - v^2/c^2}},$$

in agreement with the well-known result following directly from the
Lorentz transformation equations.

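If the moving axes are parallel to the original ones ($\omega = 0$) we get
in a similar way from the general formula $j'_n = c\,\psi'^\dagger\gamma_n\psi' = c\,\psi^\dagger L^\dagger\gamma_n L\psi$

$$j'_{n'} = c\,\psi^\dagger\bigl[\gamma_{n'}(\cosh^2\tfrac12\theta + \sinh^2\tfrac12\theta) + 2\cosh\tfrac12\theta\sinh\tfrac12\theta\bigr]\psi,$$

that is,

$$j'_{n'} = j_{n'}\cosh\theta + c\rho\sinh\theta = \frac{j_{n'} + \rho v}{\sqrt{1 - v^2/c^2}}.$$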
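The hyperbolic and the square-root forms of these transformation formulae can be compared numerically. The sketch below assumes the normalization in which $j$ carries the factor $c$ (so that $\rho' = \rho\cosh\theta + (j/c)\sinh\theta$ and $j' = j\cosh\theta + c\rho\sinh\theta$ for the component along $\mathbf n'$); the function names and sample numbers are illustrative only.

```python
import math


def rapidity(v, c=1.0):
    """The angle theta defined by tanh(theta) = v/c."""
    return math.atanh(v / c)


def hyperbolic_form(rho, j, v, c=1.0):
    """rho' = rho*cosh + (j/c)*sinh and j' = j*cosh + c*rho*sinh."""
    theta = rapidity(v, c)
    return (rho * math.cosh(theta) + (j / c) * math.sinh(theta),
            j * math.cosh(theta) + c * rho * math.sinh(theta))


def square_root_form(rho, j, v, c=1.0):
    """The same quantities written with the factor 1/sqrt(1 - v^2/c^2)."""
    g = 1.0 / math.sqrt(1.0 - v * v / (c * c))
    return ((rho + j * v / (c * c)) * g, (j + rho * v) * g)


rho, j, v, c = 2.0, 0.5, 0.6, 1.0
print(hyperbolic_form(rho, j, v, c))
print(square_root_form(rho, j, v, c))
```

The two printed pairs should agree, and the combination $c^2\rho^2 - j^2$ keeps its value, as expected for the time and space components of a four-vector.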

It should be mentioned that instead of introducing the relativistic
imaginary $\iota = \sqrt{-1}$ in the definition of the fourth component of four-dimensional
vectors one can distinguish two types of real components,
namely, the covariant and the contravariant, the latter differing from
the former by the opposite sign of the fourth components. The contravariant
components are denoted by the same letters as the covariant
ones with the index placed above instead of below. If, for instance,
$x_1 = x$, $x_2 = y$, $x_3 = z$, $x_4 = ct$ are the covariant components of the
space-time vector, then its contravariant components must be defined by
$x^{(1)} = x$, $x^{(2)} = y$, $x^{(3)} = z$, $x^{(4)} = -ct$. The square of a four-dimensional
vector, $A$ say, is thus equal to the sum of the products of its covariant
components with the corresponding contravariant ones:

$$A^2 = \sum_{k=1}^{4} A_k A^{(k)}.$$

In a similar way the scalar product of two vectors is defined by the
sum $\sum_{k=1}^{4} A_k B^{(k)}$ or $\sum_{k=1}^{4} A^{(k)} B_k$. With this notation Dirac’s equation can be
written in the form

$$\Bigl[\sum_{k=1}^{4}\gamma^{(k)}u_k + \gamma_0 mc\Bigr]\psi = 0,$$

where $u_4 = u_t = (p_t + e\phi)/c$ and $\gamma^{(4)} = \delta\ (= 1)$. The covariant components
of the four-dimensional velocity vector $\gamma$ must be defined
accordingly as

$$\gamma_1 = \gamma_x,\quad \gamma_2 = \gamma_y,\quad \gamma_3 = \gamma_z,\quad \gamma_4 = -\delta\ (= -1),$$

and the contravariant components of the operator $u$ as

$$u^{(1)} = u_x,\quad u^{(2)} = u_y,\quad u^{(3)} = u_z,\quad u^{(4)} = -u_t.$$

The transformation matrix $L$ obtained above thus refers to the
contravariant components of $\gamma$. It is easily seen, however, that it can
be applied just as well to the covariant ones.

Quantities of the type of Dirac’s wave function quadruplet $\psi_1$, $\psi_2$,
$\psi_3$, $\psi_4$ can be regarded as forming in the space-time manifold a kind
of tensor of rank $\tfrac12$. This means that they are related to an ordinary
vector (i.e. tensor of the first rank) in the same way as the latter is
related to an ordinary tensor (of the second rank). This connexion is
plainly seen from the fact that an ordinary vector, like the probability
current density $(\mathbf j, \rho)$, can be expressed with the help of the $\psi$’s as a
quadratic quantity, just as a tensor (of the second rank) can be
expressed as a quadratic quantity by means of the components of a
vector or of two different vectors.

It has recently been shown by various authors‡ that each of the two
pairs of functions $\psi_1, \psi_2$ ($= \phi_1, \phi_2$) and $\psi_3, \psi_4$ ($= \chi_1, \chi_2$) rather than the
whole quadruplet determines a ‘tensor of the rank $\tfrac12$’. Any pair of such
quantities, whose transformation properties in the state-space of the
spin coordinate (with its two values 1 and 2) are connected with the
transformation properties of vectors in the ordinary space-time manifold
by the above equations, are called, following Ehrenfest, a spinor.
The two components of a spinor, $\phi_1$ and $\phi_2$ say, are complex numbers;
they determine therefore four real numbers which can serve to specify
the components of an ordinary four-dimensional vector. A vector can
be defined as a particular type of spinor of the second rank, i.e. as
a quantity whose components (in the spin space!) transform like the
products of the components of two ordinary spinors, or in particular
of a single spinor $\phi$ and its adjoint quantity $\phi^\dagger$.

‡ Cf. O. Laporte and G. Uhlenbeck, Phys. Rev. 37 (1931), 1380.

It can easily be shown, for example, that the expressions

$$f_k = \phi^\dagger\sigma_k\phi \qquad (k = 1, 2, 3, 4),$$

where

$$\sigma_1 = \sigma_x,\quad \sigma_2 = \sigma_y,\quad \sigma_3 = \sigma_z,\quad \sigma_4 = \delta\ (= 1),$$

that is, $f_1 = \phi_1^*\phi_2 + \phi_2^*\phi_1$, $f_2 = i(\phi_1^*\phi_2 - \phi_2^*\phi_1)$, $f_3 = \phi_2^*\phi_2 - \phi_1^*\phi_1$ and
$f_4 = \phi_1^*\phi_1 + \phi_2^*\phi_2$, transform like the quantities $x$, $y$, $z$, $ct$ in any ordinary
rotation or in a Lorentz transformation, if $\phi$ is transformed according
to $\phi' = A\phi$ and $\phi^\dagger$ according to $\phi'^\dagger = \phi^\dagger A^\dagger$, $A$ being a two-dimensional
matrix, which reduces to the form $e^{\frac12 i\omega\sigma_n} = \cos\tfrac12\omega + i\sigma_n\sin\tfrac12\omega$ already
considered in the case of an ordinary rotation (through the angle $\omega$
about an axis $\mathbf n$). In the case of a relative motion in a direction $\mathbf n'$ with
a velocity $v$ specified by the angle $\theta = \tanh^{-1}(v/c)$ we have, so long as
the new axes are parallel to the old ones,

$$A = e^{\frac12\theta\sigma_{n'}} = \cosh\tfrac12\theta + \sigma_{n'}\sinh\tfrac12\theta \qquad (\sigma_{n'} = n'_x\sigma_x + n'_y\sigma_y + n'_z\sigma_z).$$

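This gives in particular, for a motion in the $z$-direction,

$$A = \begin{pmatrix} e^{-\frac12\theta} & 0 \\ 0 & e^{\frac12\theta} \end{pmatrix}.$$

In the most general case $A$ can be represented by the product of $e^{\frac12 i\omega\sigma_n}$
with $e^{\frac12\theta\sigma_{n'}}$, that is,

$$A = \cos\tfrac12\omega\cosh\tfrac12\theta + i\sigma_n\sin\tfrac12\omega\cosh\tfrac12\theta + \sigma_{n'}\sinh\tfrac12\theta\cos\tfrac12\omega + i\sigma_n\sigma_{n'}\sin\tfrac12\omega\sinh\tfrac12\theta$$

or, since $\sigma_n\sigma_{n'} = (\boldsymbol\sigma\cdot\mathbf n)(\boldsymbol\sigma\cdot\mathbf n') = \mathbf n\cdot\mathbf n' + i\boldsymbol\sigma\cdot(\mathbf n\times\mathbf n')$,

$$A = \cos\tfrac12\omega\cosh\tfrac12\theta + i\,\mathbf n\cdot\mathbf n'\sin\tfrac12\omega\sinh\tfrac12\theta + i\sigma_n\sin\tfrac12\omega\cosh\tfrac12\theta + \sigma_{n'}\sinh\tfrac12\theta\cos\tfrac12\omega - \boldsymbol\sigma\cdot(\mathbf n\times\mathbf n')\sin\tfrac12\omega\sinh\tfrac12\theta.$$

The elements of this matrix are easily verified to satisfy the relation
$|A| = A_{11}A_{22} - A_{12}A_{21} = 1$.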
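The vector character of the quantities $f_k = \phi^\dagger\sigma_k\phi$ can be illustrated for a motion along $z$. The sketch below is a minimal check under stated assumptions: it uses $\sigma_y = \begin{pmatrix}0 & i\\ -i & 0\end{pmatrix}$ and $\sigma_z = \begin{pmatrix}-1 & 0\\ 0 & 1\end{pmatrix}$ (the convention visible in (305)), for which $\cosh\tfrac12\theta + \sigma_z\sinh\tfrac12\theta$ is diagonal with elements $e^{-\theta/2}$, $e^{\theta/2}$; the sample spinor and the function names are arbitrary.

```python
import math


def bilinears(phi):
    """Return (f1, f2, f3, f4) with f_k = phi^dagger sigma_k phi, sigma_4 = unit matrix."""
    p1, p2 = phi
    f1 = (p1.conjugate() * p2 + p2.conjugate() * p1).real
    f2 = (1j * (p1.conjugate() * p2 - p2.conjugate() * p1)).real
    f3 = abs(p2) ** 2 - abs(p1) ** 2
    f4 = abs(p1) ** 2 + abs(p2) ** 2
    return f1, f2, f3, f4


def boost_z(phi, theta):
    """Apply A = cosh(theta/2) + sigma_z sinh(theta/2) = diag(e^{-theta/2}, e^{theta/2})."""
    return (math.exp(-theta / 2) * phi[0], math.exp(theta / 2) * phi[1])


phi = (0.3 + 0.4j, 1.0 - 0.2j)     # arbitrary sample spinor
theta = 0.7
f = bilinears(phi)
g = bilinears(boost_z(phi, theta))

print(g[0] - f[0], g[1] - f[1])                                   # transverse components unchanged
print(g[2] - (f[2] * math.cosh(theta) + f[3] * math.sinh(theta)))  # z-component mixes with the fourth
print(g[3] - (f[3] * math.cosh(theta) + f[2] * math.sinh(theta)))
```

All printed differences should vanish to rounding accuracy. For a single spinor the resulting vector has vanishing square, $f_4^2 - f_1^2 - f_2^2 - f_3^2 = 0$, which the transformation preserves.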

Using the notation

$$\phi_1^* = \phi_{\dot 1},\qquad \phi_2^* = \phi_{\dot 2},$$

i.e. replacing the conjugate complex sign by dotted indices, one can
write the covariant components of a spinor of the second rank in three
different forms, namely,

$$\phi_{kl},\qquad \phi_{\dot k l},\qquad \phi_{\dot k\dot l}\qquad (k, l = 1, 2),$$

these components transforming as the products $\phi_k\phi_l$, $\phi_k^*\phi_l$, and $\phi_k^*\phi_l^*$
respectively.

Besides covariant components of spinors we must also distinguish
contravariant ones. For a spinor of the first rank these are defined by
the relations

$$\phi^{(1)} = \phi_2,\quad \phi^{(2)} = -\phi_1;\qquad \phi^{(\dot 1)} = \phi_{\dot 2},\quad \phi^{(\dot 2)} = -\phi_{\dot 1},$$

because this ensures the invariance of the ‘scalar products’ $\phi_1\chi^{(1)} + \phi_2\chi^{(2)}$.

The contravariant or mixed components of spinors of higher rank
are connected with the covariant ones in a similar way. We have, for
example,

$$\phi^{11} = \phi_{22},\quad \phi^{12} = -\phi_{21},\quad \phi^{21} = -\phi_{12},\quad \phi^{22} = \phi_{11},\quad\text{etc.}$$

The components of a (four-dimensional) vector / can be represented 
with the help of a spinor of the second rank by the formulae 

Z 1 = A — Kfei+faz), f 2 — ~ $ 12 ) > 

f 3 fs — / 4 = f\ = 

We shall not engage in a more detailed discussion of this question
and shall point out in conclusion the following important circumstance.
In our derivation of Dirac’s equations as a generalization of the
equations of Maxwell’s theory we originally introduced, instead of
the quadruplet $\psi_1$, $\psi_2$, $\psi_3$, $\psi_4$, eight quantities $M_1$, $M_2$, $M_3$, $M_0$; $N_1$, $N_2$,
$N_3$, $N_0$, visualizing the six quantities $M_1$, $M_2$, $M_3$, $-N_1$, $-N_2$, $-N_3$
as analogous to the electromagnetic field components $H_x$, $H_y$, $H_z$,
$E_x$, $E_y$, $E_z$, while $M_0$ and $N_0$ were regarded as two additional scalar
quantities. This point of view had to be abandoned in the sequel
because of the rearrangement of the Maxwell-like equations, corresponding
to the introduction of the additional terms containing the
rest-mass of the electron $m_0$. If, however, instead of the first-order
equations we consider the second-order equations only (which are
a generalization of the d’Alembert equations of the electromagnetic
theory), we can preserve the above point of view and treat the quantities
$M_1$, $M_2$, $M_3$, $iN_1$, $iN_2$, $iN_3$ as the components of a four-dimensional
antisymmetric tensor of the second rank $M_{kl} = -M_{lk}$ ($k, l = 1, 2, 3, 4$)
transforming under a Lorentz transformation in the same way as the
components of the electromagnetic field-tensor $F_{kl} = -F_{lk}$ ($F_{23} = H_1$,
$F_{31} = H_2$, $F_{12} = H_3$, $F_{14} = -iE_1$, $F_{24} = -iE_2$, $F_{34} = -iE_3$). It has
been shown further that in this case we can put $\mathbf N = \pm i\mathbf M$, which
corresponds to the ‘self-duality’ of the tensor $M_{kl}$, and introduce accordingly
the relation $N_0 = \pm iM_0$ between the scalars (invariants) $M_0$ and
$N_0$, thus reducing the eight quantities $M$, $N$ to four, just as in the case
of Dirac’s equation.

The fallacy of this procedure is shown by the fact that it does not
permit us to define a four-dimensional vector representing the probability
current $\mathbf j$ and density $\rho$. The latter would appear in such a theory not
as the fourth (time) component of a vector but as the (4,4)-component
of a tensor of the second rank, corresponding to the tensor of the electromagnetic
energy and momentum; the components of the vector $\mathbf j$
would appear likewise as the (1,4), (2,4), and (3,4) components of
this tensor, corresponding to the components of the energy-stream. So
long as we confine ourselves to ordinary rotations in three-dimensional
space this circumstance remains irrelevant; it becomes, however, a
challenge to the theory when we pass to the more general Lorentz
transformation, involving the transformation of the time. In order to
make $\mathbf j, \rho$ a regular four-dimensional vector we must consider the
quantities $M$, $N$ as defining a spinor $\psi$ (or more exactly two spinors
$\phi$, $\chi$) whose transformation properties have been studied in this section.

The above argument serves to show in a most convincing way the
restricted character of the analogy between matter and light as represented
by the probability and the electromagnetic waves respectively.
A ‘wave-mechanical’ theory of light similar to that of matter would
necessitate the introduction of a new type of probability field, connected
with the photons in the same way as with ordinary particles
and entirely different from the electromagnetic field which has been
used hitherto to describe the phenomena of light from the point of view
of the wave theory. It does not seem, however, that the introduction
of such a probability field with spinor properties is warranted by the
experimental facts.

36. Transformation of the Dirac Equation to Curvilinear Coordinates

We have considered hitherto cartesian coordinates only. We shall now
generalize the results obtained for a transformation from the cartesian
system $x$, $y$, $z$ to any system of orthogonal curvilinear coordinates $q_1$, $q_2$, $q_3$.
Such a system can be specified by the following expression for the
square of the line-element (i.e. the distance between two neighbouring
points):

$$ds^2 = e_1^2\,dq_1^2 + e_2^2\,dq_2^2 + e_3^2\,dq_3^2,$$

where $\mathbf e_1$, $\mathbf e_2$, $\mathbf e_3$ are mutually perpendicular vectors tangential to the
coordinate lines which pass through one end of $ds$. The products $e_i\,dq_i$
play the same role as the differentials $dx_i$ of a local cartesian system
passing through $P$ with its axes parallel to the vectors $\mathbf e_i$, so that the
rectangular components of the operator $\mathbf p$ can be written in the form

$$p'_k = \frac{h}{2\pi i}\,\frac{1}{e_k}\,\frac{\partial}{\partial q_k}.$$

In transforming the expression $\boldsymbol\gamma\cdot\mathbf p\,\psi = \bigl(\sum_k\gamma_k p_k\bigr)\psi$ to the new coordinates,
with the help of the formula $\gamma'_k = L^{-1}\gamma_k L$, we must take into
account the fact that the matrix $L$ is to be considered as a function of
the coordinates, varying from point to point with the direction of the
local cartesian axes. We thus get

$$\boldsymbol\gamma'\cdot\mathbf p'\,\psi = L^{-1}\Bigl(\sum_k\gamma_k L\,p'_k\Bigr)\psi = L^{-1}\sum_k\gamma_k\,[p'_k L\psi - (p'_k L - Lp'_k)\psi],$$

or

$$\boldsymbol\gamma\cdot\mathbf p\,\psi = L^{-1}\sum_{k=1}^{3}\gamma_k\,[p'_k - (p'_k L - Lp'_k)L^{-1}]\,L\psi.$$

In order to obtain the transformed equations we must accordingly
replace the components of the vector $\mathbf p$ by the ‘covariant’ operators

$$P'_k = p'_k - (p'_k L - Lp'_k)L^{-1}. \tag{309}$$


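In the special case of orthogonal coordinates they assume the form

$$P'_k = \frac{h}{2\pi i}\,\frac{1}{e_k}\Bigl(\frac{\partial}{\partial q_k} - \frac{\partial}{\partial q_k}\log L\Bigr), \tag{309 a}$$

where

$$\frac{\partial}{\partial q_k}\log L = \frac{\partial L}{\partial q_k}L^{-1}.$$

Now, as has been shown above, the matrix $L$ can be defined by the
expression $e^{\frac12 i\omega\xi_n}$, where the rotation angle $\omega$ and the axis of rotation $\mathbf n$
must be considered as certain functions of the coordinates. We thus
have

$$\log L = \tfrac12 i\omega\xi_n = \tfrac12 i\,\boldsymbol\omega\cdot\boldsymbol\xi, \tag{309 b}$$

the vector $\boldsymbol\omega$ serving to determine the rotation both with respect to
magnitude and direction.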

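Let us consider the infinitesimal rotation $d\boldsymbol\omega$ corresponding to a
transition from a point $P$ (with the coordinates $q_k$) to a neighbouring
point $P'$ (with the coordinates $q'_k = q_k + dq_k$).

Introducing three unit vectors $\mathbf f_k = \mathbf e_k/e_k$ in the direction of the
coordinate lines, we can obviously put

$$d\mathbf f_k = d\boldsymbol\omega\times\mathbf f_k,$$

whence

$$\mathbf f_i\cdot d\mathbf f_k = \mathbf f_i\cdot(d\boldsymbol\omega\times\mathbf f_k) = d\boldsymbol\omega\cdot(\mathbf f_k\times\mathbf f_i) = \pm(d\boldsymbol\omega)_j,$$

where $\mathbf f_j$ is the unit vector perpendicular to $\mathbf f_k$ and $\mathbf f_i$, the positive sign
corresponding to an even character of the permutation $(k, i, j)$ and the
negative sign to an odd one.

We have on the other hand

$$\mathbf f_i\cdot d\mathbf f_k = (\mathbf e_i/e_i)\cdot d(\mathbf e_k/e_k) = \mathbf e_i\cdot d\mathbf e_k/(e_i e_k),$$

since the vectors $\mathbf e_i$ and $\mathbf e_k$ are mutually orthogonal ($i \ne k$), whence,
putting for the sake of brevity $\partial/\partial q_h = \partial_h$,

$$\pm(\partial_h\boldsymbol\omega)_j = \mathbf e_i\cdot\partial_h\mathbf e_k/(e_i e_k). \tag{310}$$

It follows from the formula

$$d\mathbf r = \mathbf e_1\,dq_1 + \mathbf e_2\,dq_2 + \mathbf e_3\,dq_3,$$

which can serve for the definition of the vectors $\mathbf e_i$, that the latter are
equal to the differential coefficients of the radius vector $\mathbf r$ of the
point $(q_1, q_2, q_3)$ with respect to the corresponding coordinates. We
thus have

$$\partial_h\mathbf e_k = \partial_k\mathbf e_h$$

and consequently

$$\mathbf e_i\cdot\partial_i\mathbf e_k = \mathbf e_i\cdot\partial_k\mathbf e_i = \tfrac12\partial_k(\mathbf e_i\cdot\mathbf e_i) = \tfrac12\partial_k e_i^2 = e_i\,\partial_k e_i.$$

Further, since $\mathbf e_i\cdot\mathbf e_k = 0$ ($k \ne i$),

$$\mathbf e_k\cdot\partial_i\mathbf e_i = -\mathbf e_i\cdot\partial_i\mathbf e_k = -e_i\,\partial_k e_i,$$

and if $h$ is different from both $k$ and $i$,

$$\mathbf e_k\cdot\partial_h\mathbf e_i = \mathbf e_i\cdot\partial_h\mathbf e_k = 0.$$

The latter equation is easily obtained in conjunction with the fact that
$\partial_h(\mathbf e_k\cdot\mathbf e_i) = 0$.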

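Substituting these expressions in (310) we find

$$\begin{aligned}
(\partial_1\boldsymbol\omega)_1 &= 0, & (\partial_2\boldsymbol\omega)_1 &= -\frac{1}{e_3}\partial_3 e_2, & (\partial_3\boldsymbol\omega)_1 &= \frac{1}{e_2}\partial_2 e_3,\\
(\partial_1\boldsymbol\omega)_2 &= \frac{1}{e_3}\partial_3 e_1, & (\partial_2\boldsymbol\omega)_2 &= 0, & (\partial_3\boldsymbol\omega)_2 &= -\frac{1}{e_1}\partial_1 e_3,\\
(\partial_1\boldsymbol\omega)_3 &= -\frac{1}{e_2}\partial_2 e_1, & (\partial_2\boldsymbol\omega)_3 &= \frac{1}{e_1}\partial_1 e_2, & (\partial_3\boldsymbol\omega)_3 &= 0.
\end{aligned}$$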


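Now according to (309 a) and (309 b)

$$\sum_k\frac{\gamma_k}{e_k}\,\partial_k\log L = \frac{i}{2}\sum_k\frac{\gamma_k}{e_k}\,\partial_k(\boldsymbol\omega\cdot\boldsymbol\xi), \tag{310 a}$$

that is,

$$\sum_k\frac{\gamma_k}{e_k}\,\partial_k\log L = \frac{i}{2}\,\boldsymbol\xi\cdot\sum_k\frac{\gamma_k}{e_k}\,\partial_k\boldsymbol\omega.$$

The first component of the vector $\sum_k\dfrac{\gamma_k}{e_k}\,\partial_k\boldsymbol\omega$ (i.e. its product with
$\mathbf f_1$) is

$$\gamma_2\frac{1}{e_2}(\partial_2\boldsymbol\omega)_1 + \gamma_3\frac{1}{e_3}(\partial_3\boldsymbol\omega)_1 = -\gamma_2\frac{1}{e_3}\partial_3\log e_2 + \gamma_3\frac{1}{e_2}\partial_2\log e_3$$

according to (310 a). Multiplying the right-hand side by $\xi_1$ we get, since
$\xi_1\gamma_2 = \rho_1\xi_1\xi_2 = i\rho_1\xi_3 = i\gamma_3$ and $\xi_1\gamma_3 = \rho_1\xi_1\xi_3 = -i\rho_1\xi_2 = -i\gamma_2$,

$$-i\Bigl[\gamma_3\frac{1}{e_3}\frac{\partial}{\partial q_3}\log e_2 + \gamma_2\frac{1}{e_2}\frac{\partial}{\partial q_2}\log e_3\Bigr].$$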

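We thus find

$$\boldsymbol\xi\cdot\sum_k\frac{\gamma_k}{e_k}\,\partial_k\boldsymbol\omega = -i\Bigl[\gamma_1\frac{1}{e_1}\frac{\partial}{\partial q_1}\log(e_2 e_3) + \gamma_2\frac{1}{e_2}\frac{\partial}{\partial q_2}\log(e_3 e_1) + \gamma_3\frac{1}{e_3}\frac{\partial}{\partial q_3}\log(e_1 e_2)\Bigr]$$

and consequently

$$\boldsymbol\gamma\cdot\mathbf P' = \sum_{k=1}^{3}\gamma_k\Bigl[p'_k - \frac{h}{2\pi i}\,\frac{1}{2e_k}\,\frac{\partial}{\partial q_k}\log\Bigl(\frac{e_1 e_2 e_3}{e_k}\Bigr)\Bigr], \tag{311}$$

it being understood that the second term in the brackets represents an
ordinary number and not an operator. We can also write

$$\boldsymbol\gamma\cdot\mathbf P' = \frac{h}{2\pi i}\sum_{k=1}^{3}\gamma_k\frac{1}{e_k}\Bigl[\frac{\partial}{\partial q_k} - \frac{\partial}{\partial q_k}\log\sqrt{\frac{e_1 e_2 e_3}{e_k}}\Bigr], \tag{311 a}$$

the transformed Dirac equation being

$$\Bigl\{p_t + e\phi + c\,\boldsymbol\gamma\cdot\Bigl(\mathbf P' - \frac{e}{c}\mathbf A\Bigr) + \gamma_0 m_0 c^2\Bigr\}\psi' = 0. \tag{311 b}$$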

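Two special cases should be especially noted, namely, that of a
cylindrical and of a spherical coordinate system. In the former case
we have, putting

$$q_1 = r = \sqrt{x^2 + y^2},\quad q_2 = \phi\ \text{(angle)},\quad\text{and}\quad q_3 = z,$$
$$e_1 = 1,\quad e_2 = r,\quad e_3 = 1,$$

and consequently

$$\boldsymbol\gamma\cdot\mathbf P' = \frac{h}{2\pi i}\Bigl[\gamma_1\Bigl(\frac{\partial}{\partial r} - \frac{\partial}{\partial r}\log\sqrt r\Bigr) + \gamma_2\frac{1}{r}\frac{\partial}{\partial\phi} + \gamma_3\frac{\partial}{\partial z}\Bigr],$$

that is,

$$\boldsymbol\gamma\cdot\mathbf P' = \frac{h}{2\pi i}\Bigl[\gamma_1\Bigl(\frac{\partial}{\partial r} - \frac{1}{2r}\Bigr) + \gamma_2\frac{1}{r}\frac{\partial}{\partial\phi} + \gamma_3\frac{\partial}{\partial z}\Bigr].$$

In the latter case, putting

$$q_1 = r = \sqrt{x^2 + y^2 + z^2},\quad q_2 = \theta\ \text{(colatitude)},\quad\text{and}\quad q_3 = \phi,$$

we have $e_1 = 1$, $e_2 = r$, $e_3 = r\sin\theta$,

and consequently

$$\boldsymbol\gamma\cdot\mathbf P' = \frac{h}{2\pi i}\Bigl[\gamma_1\Bigl(\frac{\partial}{\partial r} - \frac{\partial}{\partial r}\log\{r\sqrt{\sin\theta}\}\Bigr) + \gamma_2\frac{1}{r}\Bigl(\frac{\partial}{\partial\theta} - \frac{\partial}{\partial\theta}\log\sqrt{r\sin\theta}\Bigr) + \gamma_3\frac{1}{r\sin\theta}\Bigl(\frac{\partial}{\partial\phi} - \frac{\partial}{\partial\phi}\log\sqrt r\Bigr)\Bigr],$$

that is,

$$\boldsymbol\gamma\cdot\mathbf P' = \frac{h}{2\pi i}\Bigl[\gamma_1\Bigl(\frac{\partial}{\partial r} - \frac{1}{r}\Bigr) + \gamma_2\frac{1}{r}\Bigl(\frac{\partial}{\partial\theta} - \tfrac12\cot\theta\Bigr) + \gamma_3\frac{1}{r\sin\theta}\frac{\partial}{\partial\phi}\Bigr]. \tag{312 a}$$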


This expression can be used to reduce to its simplest form the problem 
of the hydrogen atom, which has been discussed already by a less 
straightforward method in § 33. 
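The ordinary-number terms that accompany the derivatives in the two special cases are logarithmic derivatives of the scale factors, $\partial\log\sqrt{e_1e_2e_3/e_k}/\partial q_k$. The sketch below evaluates these by central differences for cylindrical and spherical coordinates. It is only an illustration of their magnitude: the sign with which they enter the operator depends on the conventions adopted for $L$ and is not asserted here, and the function names are hypothetical.

```python
import math


def log_derivative_terms(scale_factors, q, step=1e-6):
    """Return d/dq_k log sqrt(e1*e2*e3/e_k) for k = 1, 2, 3 by central differences.

    scale_factors(q) must return the tuple (e1, e2, e3) at the point q.
    """
    def log_root(k, point):
        e = scale_factors(point)
        return 0.5 * math.log(e[0] * e[1] * e[2] / e[k])

    terms = []
    for k in range(3):
        forward, backward = list(q), list(q)
        forward[k] += step
        backward[k] -= step
        terms.append((log_root(k, forward) - log_root(k, backward)) / (2 * step))
    return terms


def cylindrical(q):
    """Scale factors for q = (r, phi, z)."""
    return (1.0, q[0], 1.0)


def spherical(q):
    """Scale factors for q = (r, theta, phi), theta being the colatitude."""
    return (1.0, q[0], q[0] * math.sin(q[1]))


print(log_derivative_terms(cylindrical, (2.0, 0.3, 1.0)))   # approximately [1/(2r), 0, 0]
print(log_derivative_terms(spherical, (2.0, 1.0, 0.5)))     # approximately [1/r, cot(theta)/2, 0]
```

For the cylindrical system only the radial term $1/2r$ survives; for the spherical system the terms are $1/r$ and $\tfrac12\cot\theta$, the azimuthal one vanishing.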

It should be mentioned that in calculating the product $\boldsymbol\gamma\cdot\mathbf A = \sum_k\gamma_k A_k$
the quantities $A_k$ must be understood to represent the components of
the vector potential along the axes of the local cartesian systems, i.e.
along the vectors $\mathbf e_1$, $\mathbf e_2$, $\mathbf e_3$ ($A_k = \mathbf A\cdot\mathbf f_k$). The matrices $\gamma_1$, $\gamma_2$, $\gamma_3$, though
identical with the original matrices $\gamma_x$, $\gamma_y$, $\gamma_z$, have now a different
physical meaning, denoting the components of the vector $\boldsymbol\gamma$ along the
axes of the local system and not of the original cartesian system of
coordinates.

The preceding results can be further generalized for the case of a
non-orthogonal system of curvilinear coordinates. We must distinguish in
this case contravariant and covariant components of different vectors,
the former transforming as $dq_1$, $dq_2$, $dq_3$ and the latter as $\partial/\partial q_1$, $\partial/\partial q_2$,
$\partial/\partial q_3$. Putting $p'_k = \dfrac{h}{2\pi i}\dfrac{\partial}{\partial q_k}$ and denoting the contravariant components
of the vector $\boldsymbol\gamma$ in the new system by $\gamma'^{(k)}$, we can write the operator
$\boldsymbol\gamma\cdot\mathbf p$ in the form $\sum_{k=1}^{3}\gamma'^{(k)}p'_k$.

Introducing a generalized (non-unitary) transformation matrix $L$
according to the condition

$$\gamma'^{(k)} = L^{-1}\gamma_k L$$

(where $\gamma_1 = \gamma_x$, $\gamma_2 = \gamma_y$, $\gamma_3 = \gamma_z$), we get

$$(\boldsymbol\gamma\cdot\mathbf p)\psi = L^{-1}\sum_{k=1}^{3}\gamma_k\,[p'_k - (p'_k L - Lp'_k)L^{-1}]\,L\psi,$$

whence it follows that the transformed Dirac equation for the new wave
functions $\psi' = L\psi$ will differ from the original one in the same way as
in the case of orthogonal coordinates, the operators $p'_k = (h/2\pi i)\,\partial/\partial q_k$
being replaced by

$$P'_k = p'_k - (p'_k L - Lp'_k)L^{-1} = \frac{h}{2\pi i}\Bigl[\frac{\partial}{\partial q_k} - \frac{\partial\log L}{\partial q_k}\Bigr].$$

We shall not determine here the matrix $L$ for the general case of
non-orthogonal coordinates, for it is not of practical interest.

The preceding results can be further generalized by introducing four-dimensional
transformations, involving not only the space coordinates
but also the time. Such transformations can be used to include the
effects of the gravitational field on the motion of the electron in
accordance with the relativity theory of gravitation. These considerations
lie, however, beyond the scope of the present book.

In conclusion, the following transformation property of Dirac’s equation
should be mentioned.

The electromagnetic field is represented in Dirac’s equation by the
potentials $\mathbf A$, $\phi$. Now from the relations $\mathbf E = -\nabla\phi - \partial\mathbf A/c\,\partial t$, $\mathbf H = \operatorname{curl}\mathbf A$,
it follows that the electromagnetic field strengths are not altered if $\mathbf A$ is
replaced by $\mathbf A' = \mathbf A + \nabla\chi$ and $\phi$ by $\phi' = \phi - \partial\chi/c\,\partial t$, where $\chi$ is an arbitrary
function of the coordinates and of the time. Since it is the field
strengths and not the potentials which have a direct physical meaning,
the above transformation of the potentials must be irrelevant for
Dirac’s equation; that is, the transformed equation

$$[(p_t + e\phi')/c + \boldsymbol\gamma\cdot(\mathbf p - e\mathbf A'/c) + \gamma_0 mc]\,\psi' = 0$$

must be equivalent to the equation

$$[(p_t + e\phi)/c + \boldsymbol\gamma\cdot(\mathbf p - e\mathbf A/c) + \gamma_0 mc]\,\psi = 0$$

with the original potentials. This is easily verified, the transformed
wave function $\psi'$ being connected with the original one by the equation
$\psi' = e^{i2\pi e\chi/hc}\psi$. So long as $\chi$ is a real quantity (as of course it must be),
the two functions, or rather function-quadruplets, $\psi$ and $\psi'$ correspond to
identical values of the probabilities and thus determine the same motion.
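The verification referred to above can be sketched numerically in one dimension. The example is a hedged illustration under stated assumptions: units are chosen so that $h/2\pi = e = c = 1$, the functions `psi`, `chi` and `A` are arbitrary smooth samples invented for the test, and derivatives are taken by central differences. It checks that $(p - eA'/c)\psi' = e^{i2\pi e\chi/hc}(p - eA/c)\psi$ with $p = (h/2\pi i)\,d/dx$, and that $|\psi'| = |\psi|$.

```python
import cmath
import math

HBAR, E, C = 1.0, 1.0, 1.0     # illustrative units: h/(2*pi) = e = c = 1


def psi(x):
    """Arbitrary sample wave function (one component suffices for this check)."""
    return cmath.exp(-x * x + 0.7j * x)


def chi(x):
    """Arbitrary real gauge function."""
    return math.sin(x)


def A(x):
    """Arbitrary sample vector potential component."""
    return x * x


def deriv(f, x, h=1e-5):
    """Central-difference derivative; works for real and complex f."""
    return (f(x + h) - f(x - h)) / (2 * h)


def phase(x):
    """exp(i*2*pi*e*chi/(h*c)), written with HBAR = h/(2*pi)."""
    return cmath.exp(1j * E * chi(x) / (HBAR * C))


def psi_prime(x):
    return phase(x) * psi(x)


def A_prime(x):
    """A' = A + d(chi)/dx."""
    return A(x) + deriv(chi, x)


def kinetic(f, a, x):
    """(p - e*a/c) f evaluated at x, with p = (HBAR/i) d/dx."""
    return (HBAR / 1j) * deriv(f, x) - (E / C) * a(x) * f(x)


def gauge_residual(x):
    """|(p - eA'/c) psi' - phase * (p - eA/c) psi| at the point x."""
    return abs(kinetic(psi_prime, A_prime, x) - phase(x) * kinetic(psi, A, x))


for x in (-0.5, 0.3, 1.2):
    print(x, gauge_residual(x), abs(psi_prime(x)) - abs(psi(x)))
```

The residuals are limited only by the finite-difference error, and the moduli agree exactly, so the probabilities are the same for both functions.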

This transformation can be considered as a special transformation
of coordinates, the transformation matrix $L$ being defined as the product
of the matrix $\delta = 1$ by the function $e^{i2\pi e\chi/hc}$. It is clear that the
coordinates are actually not affected by a transformation of this type.
We see at the same time that the introduction of an electromagnetic
field can be described in a geometrical language as a generalization of
ordinary coordinate transformations, the quantities $(h/2\pi i)\,\partial L/\partial q_k$ being
replaced by $eA_k/c$.

VI

THE PROBLEM OF MANY PARTICLES

37. General Results, Virial Theorem, Linear and Angular Momentum

The problem of many particles has been considered already in the first
part (Chapter IV) on the basis of the non-relativity mechanics of a single
particle. Using the method of the configuration space, we arrived, in
the case of two different particles, at the equation (101), which in the
general case of a system of $n$ different particles with the masses
$m_1, m_2, \dots, m_n$ and the potential energy $U(x_1, y_1, z_1; \dots; x_n, y_n, z_n; t)$ can be
written in the form

$$-\frac{h^2}{8\pi^2}\sum_{k=1}^{n}\frac{1}{m_k}\nabla_k^2\psi + U\psi + \frac{h}{2\pi i}\frac{\partial\psi}{\partial t} = 0, \tag{313}$$

where

$$\nabla_k^2 = \frac{\partial^2}{\partial x_k^2} + \frac{\partial^2}{\partial y_k^2} + \frac{\partial^2}{\partial z_k^2}.$$

Using the notation

$$p_{kx} = \frac{h}{2\pi i}\frac{\partial}{\partial x_k},\qquad p_{ky} = \frac{h}{2\pi i}\frac{\partial}{\partial y_k},\qquad p_{kz} = \frac{h}{2\pi i}\frac{\partial}{\partial z_k} \tag{313 a}$$

and

$$H = \sum_{k=1}^{n}\frac{p_k^2}{2m_k} + U, \tag{313 b}$$

we can rewrite (313) in the standard operator form

$$(H + p_t)\psi = 0, \tag{314}$$

$p_t$ denoting as usual $\dfrac{h}{2\pi i}\dfrac{\partial}{\partial t}$, while $H$ represents the energy operator
or Hamiltonian for the system under consideration. It agrees with the
classical expression of the energy if the operators $\mathbf p_k$ are regarded as
representing the momenta of the separate particles. The wave-mechanical
equation (314) thus corresponds to the classical energy
equation $H - W = 0$ if $-p_t$ is replaced by the value of the energy $W$.

This correspondence has exactly the same character as for a single 
particle, for which it has been discussed in detail in Chapters I and II. 
We need not repeat here all that has been stated there, as well as in 
the following three chapters, concerning the matrix representation, the 
transformation, and the perturbation theory. It may suffice to remark 
that a system of particles, defined by the Hamiltonian (313 b), can be
dealt with from the mathematical point of view as a single particle
moving in a space of 3n dimensions, with the coordinates

$$q_1 = \sqrt{\frac{m_1}{m}}\,x_1,\quad q_2 = \sqrt{\frac{m_1}{m}}\,y_1,\quad q_3 = \sqrt{\frac{m_1}{m}}\,z_1,\quad\ldots,\quad q_{3n} = \sqrt{\frac{m_n}{m}}\,z_n. \qquad (314\,\mathrm{a})$$

Here $m$ is an arbitrary coefficient of the dimension of a mass, which
can be regarded as the mass of the 'equivalent' particle. We can put,
for instance, $m = (m_1 m_2 \ldots m_n)^{1/n}$, which gives

$$dV = dx_1\,dy_1\,dz_1 \ldots dx_n\,dy_n\,dz_n = dq_1\,dq_2 \ldots dq_{3n}.$$
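
The statement that the geometric-mean choice of $m$ leaves the volume-element unchanged can be checked numerically. The following is a minimal sketch, not part of the original argument; the function names and the mass values are illustrative assumptions.

```python
import math

def equivalent_mass(masses):
    """Geometric mean of the particle masses, m = (m_1 m_2 ... m_n)^(1/n)."""
    n = len(masses)
    return math.exp(sum(math.log(mk) for mk in masses) / n)

def volume_scale(masses, m):
    """Factor J in dq_1 ... dq_3n = J * dx_1 dy_1 dz_1 ... dx_n dy_n dz_n."""
    # Each particle contributes three coordinates, each scaled by sqrt(m_k / m).
    return math.prod((mk / m) ** 1.5 for mk in masses)

masses = [1.0, 1836.15, 7344.6]      # illustrative values only (assumption)
m = equivalent_mass(masses)
jacobian = volume_scale(masses, m)   # equals 1 up to rounding for the geometric mean
```

With any other choice of $m$ the factor differs from unity, which is why the general volume-element carries the factor $(m^n/m_1 m_2 \ldots m_n)^{3/2}$.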

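The corresponding momentum components are defined in the classical
theory by the formulae

$$p_1 = m\dot q_1 = \sqrt{m m_1}\,\dot x_1,\quad\ldots,\quad p_{3n} = m\dot q_{3n} = \sqrt{m m_n}\,\dot z_n.$$

They are represented in the wave-mechanical theory by the operators

$$p_1 = \sqrt{\frac{m}{m_1}}\,p_{1x} = \sqrt{\frac{m}{m_1}}\,\frac{h}{2\pi i}\frac{\partial}{\partial x_1},\quad\ldots,\quad p_{3n} = \sqrt{\frac{m}{m_n}}\,\frac{h}{2\pi i}\frac{\partial}{\partial z_n},$$

that is, according to (314 a),

$$p_\alpha = \frac{h}{2\pi i}\frac{\partial}{\partial q_\alpha} \quad (\alpha = 1, 2, \ldots, 3n), \qquad (314\,\mathrm{b})$$

just as in the case of a single particle. Expressed in terms of these
coordinates and momenta the Hamiltonian (313 b) assumes the standard
form

$$H = \frac{1}{2m}\sum_{\alpha=1}^{3n}p_\alpha^2 + U. \qquad (314\,\mathrm{c})$$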

All the developments of the first five chapters of this part, referring 
to the motion of a particle in ordinary three-dimensional space, can be 
immediately generalized for the case of a symbolic particle representing 
a system of n ordinary particles in the 3n-dimensional configuration
space. The generalization is in fact so simple that it is hardly necessary 
to dwell upon it. 

We shall therefore limit ourselves to the discussion of a few peculiarities
connected with the physical meaning of the problem and to
the possibility of completing and refining the theory in the same sense 
as has been done in the preceding chapter for the case of a single 
particle. 

From equation (314c) and its conjugate complex $(H - p_t)\psi^* = 0$
($H^* = H$) we can obtain in the usual way the 'conservation' or continuity
equation

$$\frac{\partial\rho}{\partial t} + \sum_{\alpha=1}^{3n}\frac{\partial j_\alpha}{\partial q_\alpha} = 0, \qquad (315)$$

where $\rho = \psi\psi^*$ is the probability density in the configuration space, and

$$j_\alpha = \frac{h}{4\pi m i}\Bigl(\psi^*\frac{\partial\psi}{\partial q_\alpha} - \psi\frac{\partial\psi^*}{\partial q_\alpha}\Bigr)$$

the components of the 3n-dimensional probability current. If equation
(315) is multiplied by the volume-element of the configuration space

$$dV = dV_1\,dV_2 \ldots dV_n = \Bigl(\frac{m^n}{m_1 m_2 \ldots m_n}\Bigr)^{3/2} dq_1\,dq_2 \ldots dq_{3n}$$

and integrated over all this space, the result obtained is

$$\frac{d}{dt}\int\rho\,dV = 0,$$

expressing the law of conservation of probability.†

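If, however, the integration is extended over the configuration space
of the second, third, ..., nth particle, while the coordinates of the first
one, $x_1, y_1, z_1$, are kept constant, we obtain an equation of the usual
three-dimensional form

$$\frac{\partial\rho_1}{\partial t} + \frac{\partial j_{1x}}{\partial x_1} + \frac{\partial j_{1y}}{\partial y_1} + \frac{\partial j_{1z}}{\partial z_1} = 0, \qquad (315\,\mathrm{a})$$

where the quantities

$$\rho_1 = \int\!\cdots\!\int\rho\,dV_2\,dV_3 \ldots dV_n, \qquad j_{1x} = \frac{h}{4\pi m_1 i}\int\!\cdots\!\int\Bigl(\psi^*\frac{\partial\psi}{\partial x_1} - \psi\frac{\partial\psi^*}{\partial x_1}\Bigr)dV_2 \ldots dV_n,\ \ldots \qquad (315\,\mathrm{b})$$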

can be interpreted as the probability density and current density for 
the first particle in the ordinary three-dimensional space. The same 
results hold, of course, for each of the other particles. 

In the particular case of a system of particles which do not act on each
other the equation $(H + p_t)\psi = 0$ has multiplicative solutions of the
form $\psi = \psi_1\psi_2 \ldots \psi_n$, where $\psi_k$ depends upon the coordinates of the kth
particle alone; we get accordingly in this case $\rho_k = \psi_k\psi_k^*$
(provided the separate factors of $\psi$ are normalized to unity) and
consequently $\rho = \rho_1\rho_2 \ldots \rho_n$. This result was the starting-point of our
discussion of the problem of many particles in Part I. In the general
case $\rho$ is, of course, different from the product $\rho_1\rho_2 \ldots \rho_n$; this
circumstance corresponds to a mutual dependence of the particles, a
dependence specified by the form of the potential-energy function $U$ or also
by statistical (i.e. symmetry) conditions, if the particles are all alike
(see below).

† We shall assume for the sake of simplicity that the integral $\int\rho\,dV$ is convergent,
which means that the particles are bound to remain in a finite region of space.

The function $U$ may be assumed to have the form

$$U = \sum_k U_k(\mathbf{r}_k, t) + \sum\sum_{k<l} U_{kl}(r_{kl}),$$

the first sum corresponding to the action of external forces, which can
depend upon the time explicitly, while the second represents the mutual
action of the particles ($U_{kl} = U_{lk}$, $r_{kl} = |\mathbf{r}_k - \mathbf{r}_l|$ = distance between
the kth and lth particles).

If $U$ does not depend upon $t$, then equation (314) admits solutions
of the form $\psi = \psi^0_{H'}(x_1, \ldots, z_n)\,e^{-i2\pi H't/h}$, where $\psi^0_{H'}$ and $H'$ are the
characteristic functions and the characteristic values of the energy operator
satisfying the usual equation

$$(H - H')\psi^0_{H'} = 0. \qquad (316)$$

In the case of a discrete spectrum of $H$ the functions $\psi^0_{H'}$ are easily
proved to be orthogonal to each other (in the configuration space), this
orthogonality being a consequence of the self-adjoint character of the
operator $H$, since

$$f_1 H f_2 - f_2 H f_1 = -\frac{h^2}{8\pi^2 m}\sum_{\alpha=1}^{3n}\frac{\partial}{\partial q_\alpha}\Bigl(f_1\frac{\partial f_2}{\partial q_\alpha} - f_2\frac{\partial f_1}{\partial q_\alpha}\Bigr).$$

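Another interesting consequence of this self-adjointness of $H$ is the
possibility of replacing the preceding equation by the variational
equation

$$\delta\int\psi^{0*}H\psi^0\,dV = 0, \qquad (316\,\mathrm{a})$$

with the accessory condition

$$\int\psi^{0*}\psi^0\,dV = 1,$$

expressing the 'normalization' of the functions $\psi^0$. Using

$$\int\psi^{0*}\frac{\partial^2\psi^0}{\partial q_\alpha^2}\,dV = -\int\frac{\partial\psi^{0*}}{\partial q_\alpha}\frac{\partial\psi^0}{\partial q_\alpha}\,dV,$$

(316 a) can be rewritten in the form

$$\delta\int\Bigl[\frac{h^2}{8\pi^2 m}\sum_{\alpha=1}^{3n}\frac{\partial\psi^{0*}}{\partial q_\alpha}\frac{\partial\psi^0}{\partial q_\alpha} + U\psi^{0*}\psi^0\Bigr]dV = 0, \qquad (316\,\mathrm{b})$$

which involves the first derivatives of $\psi$ only ($dV = dq_1 \ldots dq_{3n}$).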
An interesting application of the variational equation is afforded
by the following very simple and general proof of the virial theorem
(due to V. Fock). Let us replace the function $\psi^0(q_1, \ldots, q_{3n})$, which is
a solution of our problem, by the function $\psi' = c\,\psi^0(\lambda q_1, \ldots, \lambda q_{3n})$, which
is obtained from it by multiplying each coordinate by a certain
parameter $\lambda$ and introducing a normalizing factor $c$. Introducing
further a new set of coordinates $q'_\alpha = \lambda q_\alpha$, we can write the normalizing
condition $\int\psi'^*\psi'\,dV = 1$ in the form $\int\lambda^{-3n}\psi'^*\psi'\,dV' = 1$, where
$dV' = dq'_1 \ldots dq'_{3n}$, which gives, on using the original normalizing
condition,

$$\int\psi^{0*}(q)\psi^0(q)\,dV = \int\psi^{0*}(q')\psi^0(q')\,dV' = 1,$$

$c^2 = \lambda^{3n}$. Using this value of $c$ we can reduce the variational equation
(316 a) or (316 b) to the form $\partial H'/\partial\lambda = 0$, where $H'$ is the value of the
integral (316 a) or (316 b) which is obtained by replacing the function
$\psi^0$ by the function $\psi'$. Its minimum value corresponds, of course, to
$\lambda = 1$, which is the solution of the equation $\partial H'/\partial\lambda = 0$.

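Now using the coordinates $q'_\alpha$, we have

$$H' = \int\Bigl[\lambda^2\frac{h^2}{8\pi^2 m}\sum_{\alpha}\frac{\partial\psi^{0*}(q')}{\partial q'_\alpha}\frac{\partial\psi^0(q')}{\partial q'_\alpha} + U\Bigl(\frac{q'}{\lambda}\Bigr)\psi^{0*}(q')\psi^0(q')\Bigr]dV',$$

so that the preceding equation assumes the form

$$\int\Bigl[2\lambda\frac{h^2}{8\pi^2 m}\sum_{\alpha}\frac{\partial\psi^{0*}}{\partial q'_\alpha}\frac{\partial\psi^0}{\partial q'_\alpha} - \frac{1}{\lambda^2}\sum_{\alpha}q'_\alpha\frac{\partial U}{\partial q_\alpha}\,\psi^{0*}\psi^0\Bigr]dV' = 0.$$

Putting here $\lambda = 1$ and $q'_\alpha = q_\alpha$, we get

$$2\bar T = \overline{\sum_{\alpha=1}^{3n}q_\alpha\frac{\partial U}{\partial q_\alpha}}, \qquad (317)$$

where $\bar T$ denotes the probable (average) value of the kinetic energy of
the system, and $\overline{\sum_\alpha q_\alpha\,\partial U/\partial q_\alpha}$ its 'virial'. We have obviously

$$\sum_{\alpha}q_\alpha\frac{\partial U}{\partial q_\alpha} = \sum_k\Bigl(x_k\frac{\partial U}{\partial x_k} + y_k\frac{\partial U}{\partial y_k} + z_k\frac{\partial U}{\partial z_k}\Bigr).$$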

If the potential energy is a homogeneous function of the coordinates,
this expression reduces to the product of $U$ with the number specifying
the corresponding power. In the special case of a system of electrified
particles obeying the Coulomb law (which is approximately the case
with any actual material system constituted by protons and electrons)
we must have

$$2\bar T = -\bar U \qquad (317\,\mathrm{a})$$

or, since $\bar T + \bar U = W$ (total energy of the system),

$$\bar T = -W. \qquad (317\,\mathrm{b})$$

It should be remembered that these relations hold only so long as 
the particles remain actually bound to each other, which is expressed 
mathematically by the convergence of the integral $\int|\psi|^2\,dV$, a convergence
that subsists so long as the energy W of the state under
consideration belongs to the discrete spectrum. It should further be 
remarked that they remain valid if some of the particles are treated as 
fixed centres of force producing an ‘external’ Coulomb field of force. 
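
The scaling argument can be illustrated numerically for the simplest Coulomb system. The sketch below is an illustration under stated assumptions, not part of the original proof: it uses atomic units ($h/2\pi = m = e = 1$), a single electron in the field of a fixed unit charge, and the exponential trial function $c\,e^{-\lambda r}$ with $c^2 = \lambda^3/\pi$. The function names are hypothetical.

```python
import math

def radial_integral(f, r_max=40.0, steps=40000):
    """Composite Simpson rule for the integral of f(r) from 0 to r_max (steps even)."""
    h = r_max / steps
    total = f(0.0) + f(r_max)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(i * h)
    return total * h / 3.0

def energies(lam):
    """Mean kinetic and potential energy of psi = c * exp(-lam * r) in a Coulomb field."""
    norm = lam ** 3 / math.pi      # c^2, chosen so that the integral of |psi|^2 is 1
    # Kinetic part: (1/2) * integral of |grad psi|^2 dV, with |grad psi| = lam * psi.
    kinetic = radial_integral(
        lambda r: 0.5 * norm * lam ** 2 * math.exp(-2 * lam * r) * 4 * math.pi * r * r)
    # Potential part: integral of (-1/r) |psi|^2 dV; one power of r cancels against 1/r.
    potential = radial_integral(
        lambda r: -norm * math.exp(-2 * lam * r) * 4 * math.pi * r)
    return kinetic, potential

def total_energy(lam):
    return sum(energies(lam))

kinetic, potential = energies(1.0)   # lam = 1 is the exact ground state in these units
```

At $\lambda = 1$ the two averages satisfy $2\bar T = -\bar U$, and the total energy is lower there than at neighbouring values of $\lambda$, in agreement with (317 a) and (317 b).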

We shall now establish a few other general laws which hold for a 
closed system of particles, i.e. a system unaffected by external forces, 
such as an isolated atom or molecule, etc. 

These laws are the exact equivalents of the laws of classical mechanics 
concerning the conservation of the energy, momentum, and of the 
moment of momentum (or angular momentum) of the system. The 
first of them has been stated already. The other two can be established 
with the help of the relation 


$$\frac{dF}{dt} = [H, F] = \frac{2\pi i}{h}(HF - FH).$$

We put

$$F = \mathbf{p} = \sum_k\mathbf{p}_k \quad\text{or}\quad F = \mathbf{M} = \sum_k\mathbf{r}_k\times\mathbf{p}_k,$$

in accordance with the classical definition of the total momentum and
angular momentum (the origin from which the vectors $\mathbf{r}_k$ are supposed
to be drawn can be chosen arbitrarily).

Taking the x-component of $\mathbf{p}$, we have

$$[H, p_x] = [U, p_x] = \sum_k[U, p_{kx}] = -\sum_k\frac{\partial U}{\partial x_k}.$$

Now $-\partial U/\partial x_k$ represents the force acting on the kth particle in the
direction of the x-axis; so long as there are no external forces, the sum
of such forces for all the particles must obviously vanish. Hence we get

$$\frac{d\mathbf{p}}{dt} = 0.$$
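In a similar way we have

$$[H, M_x] = \sum_k[H, y_kp_{kz} - z_kp_{ky}] = \sum_k\bigl\{[H, y_k]p_{kz} + y_k[H, p_{kz}] - [H, z_k]p_{ky} - z_k[H, p_{ky}]\bigr\},$$

that is, since

$$[H, y_k] = \frac{\partial H}{\partial p_{ky}} = \frac{1}{m_k}p_{ky}, \qquad [H, p_{kz}] = -\frac{\partial U}{\partial z_k}, \text{ etc.,}$$

$$[H, M_x] = \sum_k\frac{1}{m_k}(p_{ky}p_{kz} - p_{kz}p_{ky}) + \sum_k\Bigl(z_k\frac{\partial U}{\partial y_k} - y_k\frac{\partial U}{\partial z_k}\Bigr).$$

The first sum vanishes since $p_{ky}$ and $p_{kz}$ commute with each other, while
the second is equal to the x-component of the vector $\sum_k(\mathbf{r}_k\times\mathbf{F}_k)$, where
$\mathbf{F}_k$ is the force acting on the kth particle. We thus get

$$\frac{d\mathbf{M}}{dt} = \sum_k\mathbf{r}_k\times\mathbf{F}_k,$$

just as in the classical theory. It is easy to see that in the case of
central forces, which we are considering, the vector $\sum_k\mathbf{r}_k\times\mathbf{F}_k$
(representing the resulting torque of all the forces acting on all the particles)
vanishes. We have in fact, putting $\mathbf{F}_k = \sum_{l\neq k}\mathbf{F}_{kl}$ and taking into account
that $\mathbf{F}_{kl} = -\mathbf{F}_{lk} = f_{kl}(\mathbf{r}_k - \mathbf{r}_l)$,

$$\sum_k\mathbf{r}_k\times\mathbf{F}_k = \tfrac{1}{2}\sum_k\sum_{l\neq k}(\mathbf{r}_k - \mathbf{r}_l)\times\mathbf{F}_{kl} = 0.$$

Hence it follows that

$$\frac{d\mathbf{M}}{dt} = 0,$$

i.e. the conservation law for the resulting angular momentum.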
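
The vanishing of the resultant force and torque for mutual central forces is easily verified on a numerical example. The following sketch assumes Coulomb forces, $f_{kl} = e_ke_l/r_{kl}^3$, between four classical point charges; the positions, charges, and function names are arbitrary illustrative choices.

```python
def cross(a, b):
    """Vector product of two 3-component sequences."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def pair_forces(positions, charges):
    """Force on each particle due to all the others, F_k = sum over l of f_kl (r_k - r_l)."""
    n = len(positions)
    forces = [[0.0, 0.0, 0.0] for _ in range(n)]
    for k in range(n):
        for l in range(n):
            if l == k:
                continue
            d = [positions[k][a] - positions[l][a] for a in range(3)]
            r = sum(x * x for x in d) ** 0.5
            f_kl = charges[k] * charges[l] / r ** 3   # symmetric in k and l
            for a in range(3):
                forces[k][a] += f_kl * d[a]
    return forces

positions = [(0.0, 0.0, 0.0), (1.0, 0.2, -0.3), (-0.4, 0.9, 0.5), (0.3, -0.7, 1.1)]
charges = [2.0, -1.0, -1.0, 1.0]
forces = pair_forces(positions, charges)

total_force = [sum(f[a] for f in forces) for a in range(3)]
total_torque = [0.0, 0.0, 0.0]
for r_k, f_k in zip(positions, forces):
    t = cross(r_k, f_k)
    total_torque = [total_torque[a] + t[a] for a in range(3)]
```

Both sums vanish to rounding accuracy, whatever origin is chosen for the vectors $\mathbf{r}_k$.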

This result, as well as the preceding one, can be obtained by another
method based on the invariance of the energy operator with regard to
a transformation of the coordinates (and momentum components)
involving a shift of the origin and a rotation of the axes about it. Let
$P$ be some fixed point in space (or in the configuration space) and $P'$
another point which in the new system has the same coordinates as
$P$ has in the old one ($x'_k = x_k$, etc.). If $f(P)$ is some function of the old
coordinates, then the transformed function will be defined by the
condition $Tf(P) = f(P')$, $T$ denoting the transformation under consideration.
The coordinates of the point $P'$ in the original system are defined
by the linear transformation equations

$$\begin{aligned}
x_k &= x_0 + a_{11}x'_k + a_{12}y'_k + a_{13}z'_k\\
y_k &= y_0 + a_{21}x'_k + a_{22}y'_k + a_{23}z'_k\\
z_k &= z_0 + a_{31}x'_k + a_{32}y'_k + a_{33}z'_k
\end{aligned}\qquad (k = 1, 2, \ldots, n).$$
In the special case of an infinitesimal transformation these equations
reduce to

$$\begin{aligned}
x_k - x'_k &= \delta x_k = x_0 + \omega_y z'_k - \omega_z y'_k,\\
y_k - y'_k &= \delta y_k = y_0 + \omega_z x'_k - \omega_x z'_k,\\
z_k - z'_k &= \delta z_k = z_0 + \omega_x y'_k - \omega_y x'_k,
\end{aligned}$$

where $\omega_x$, $\omega_y$, $\omega_z$ are the components of an infinitesimal rotation $\boldsymbol\omega$,
while $x_0$, $y_0$, $z_0$ are the components of an infinitesimal displacement $\mathbf{r}_0$.
We obtain in this case

$$Tf(P) = f(P) + \sum_k\Bigl(\frac{\partial f}{\partial x_k}\delta x_k + \frac{\partial f}{\partial y_k}\delta y_k + \frac{\partial f}{\partial z_k}\delta z_k\Bigr),$$

the derivatives of $f$ being taken for the point $P$. Neglecting small
quantities of the second order, we can replace in this equation the
primed letters (referring to $P'$) by the unprimed (referring to $P$),
which gives

$$Tf(P) = f(P) + \frac{2\pi i}{h}\bigl[(\mathbf{r}_0\cdot\mathbf{p} + \boldsymbol\omega\cdot\mathbf{M})f\bigr]_P,$$

where $\mathbf{p} = \sum_k\mathbf{p}_k$, while $\mathbf{M}$ denotes, as before, the operator of the
resulting angular momentum. We thus see that an infinitesimal
transformation $T$ can be represented by an ordinary linear differential
operator

$$T = 1 + \frac{2\pi i}{h}(\mathbf{r}_0\cdot\mathbf{p} + \boldsymbol\omega\cdot\mathbf{M}).$$

Now it is obvious from symmetry considerations that the energy H 
remains invariant under a transformation of the type T since the latter 
alters neither the value of the potential energy (depending on the 
relative position of the particles only) nor the expression of the kinetic 
energy operator (the operators $\nabla_k^2$ being independent of the orientation
of the coordinate system or of the position of its origin). This
circumstance can be expressed by the condition $TH\psi = HT\psi$, that is,
$HT = TH$, which, on the other hand, means that the operator $T$
represents a constant of the motion. In view of the arbitrariness of the
(infinitesimal) vectors $\mathbf{r}_0$ and $\boldsymbol\omega$, the equation $T$ = const. is split up
into two independent equations: $\mathbf{p}$ = const. and $\mathbf{M}$ = const., expressing
the conservation law of the resulting linear and angular momentum of
the system.

These laws are, of course, no longer satisfied in the presence of 
external forces. If, however, the latter reduce to an attraction to a 
fixed point—as in the case of a system of electrons revolving about 
a fixed nucleus supposed to act like a point-charge—then we still have 
M = const. In the presence of a homogeneous field—magnetic or 
electric—parallel to a fixed direction in space, the energy operator 
remains invariant for rotations about this direction only, and we obtain 
accordingly the conservation law for the corresponding projection of
the angular momentum, the components of the latter in the perpendicular
directions being no longer constant.

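The operator $\mathbf{M}_k = \mathbf{r}_k\times\mathbf{p}_k$ representing the angular momentum of
a single particle satisfies, as we know, the relation

$$\mathbf{M}_k\times\mathbf{M}_k = -\frac{h}{2\pi i}\mathbf{M}_k.$$

Replacing $\mathbf{M}_k$ by the resultant angular momentum operator $\mathbf{M} = \sum_k\mathbf{M}_k$,
we have

$$\mathbf{M}\times\mathbf{M} = \sum_k\mathbf{M}_k\times\mathbf{M}_k + \sum_{k<l}(\mathbf{M}_k\times\mathbf{M}_l + \mathbf{M}_l\times\mathbf{M}_k) = \sum_k\mathbf{M}_k\times\mathbf{M}_k,$$

since the operators $\mathbf{M}_k$ and $\mathbf{M}_l$ referring to different particles obviously
commute with each other. We thus get for the resulting angular momentum
the same relation as for the component ones, viz.:

$$\mathbf{M}\times\mathbf{M} = -\frac{h}{2\pi i}\mathbf{M}. \qquad (318)$$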

It has been shown in Chap. III, § 13, that it is possible by means of
the matrix method to derive from this relation the matrix elements
of $\mathbf{M}$ in a representation specified by the condition that $M^2$ and $M_z$
should be diagonal matrices (corresponding to a given value of the
energy). The number of particles involved is obviously immaterial (so
long as $\mathbf{M}$ commutes with $H$) and the results obtained before for the
case of a single particle can be directly applied to the present case.
We thus obtain, on denoting the angular quantum number by $j$ (instead
of $l$ as before) and the axial one by $m$,

$$M^2 = \Bigl(\frac{h}{2\pi}\Bigr)^2 j(j+1), \qquad (318\,\mathrm{a})$$

$$(M_z)_{m,m} = \frac{h}{2\pi}\,m \quad (-j \le m \le j), \qquad (318\,\mathrm{b})$$

(M x +iM y ) m+ i. m = ^V{(i+l) 2 -(^+l)V“” 

(M x -iM y ) mm+ j -= ^V{0'+i) 2 “( wl +l) 2 } e "’’“'" 

[cf. e.g. (96) and (96 a)]. As has been pointed out in § 13, the number
$j$ can assume, from the matrix-theory point of view, both integral and
half-integral values (the values of $m$ being of the same nature);
half-integral values occur, however, only if the spin of the particles is
included, and if $\mathbf{M}$ refers to the total, not the orbital, angular momentum.
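
Relation (318) and the characteristic values of $M^2$ can be verified directly for the simplest composite case. The sketch below is an illustration, not the author's derivation: it assumes two particles of spin one-half with no orbital motion, measures angular momentum in units of $h/2\pi$ (so that (318) reads $M_xM_y - M_yM_x = iM_z$), and uses plain nested lists with hypothetical helper names.

```python
def kron(a, b):
    """Kronecker product of two square matrices given as nested lists."""
    return [[a[i][j] * b[k][l] for j in range(len(a)) for l in range(len(b))]
            for i in range(len(a)) for k in range(len(b))]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def combine(a, b, cb=1):
    """Return the matrix a + cb * b."""
    return [[x + cb * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

one = [[1, 0], [0, 1]]
s = {   # components of a spin-1/2 angular momentum in units of h/2pi
    "x": [[0, 0.5], [0.5, 0]],
    "y": [[0, -0.5j], [0.5j, 0]],
    "z": [[0.5, 0], [0, -0.5]],
}

# Resultant angular momentum M = M_1 + M_2 on the four-dimensional product space.
M = {a: combine(kron(s[a], one), kron(one, s[a])) for a in "xyz"}

commutator_xy = combine(matmul(M["x"], M["y"]), matmul(M["y"], M["x"]), -1)
M_squared = combine(combine(matmul(M["x"], M["x"]), matmul(M["y"], M["y"])),
                    matmul(M["z"], M["z"]))
trace = sum(M_squared[i][i] for i in range(4))
```

The trace of $M^2$ equals $3\cdot 2 + 1\cdot 0$, i.e. three states with $j = 1$ and one with $j = 0$, both integral, although each constituent has a half-integral spin.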

38. Magnetic Forces and Spin Effects 

A generalization and refinement of the preceding theory along the same
lines as for a single particle, i.e. the establishment of a wave equation
$(H + p_t)\psi = 0$ which would describe the behaviour of a system of
particles in agreement with the relativity theory, taking account of
magnetic forces and of the spin effect, is a problem which admits only
of a partial and approximate solution. This circumstance is not
characteristic of the wave mechanics, for we meet with a similar situation in
the classical mechanics. The latter can be formulated in a relativistically
invariant form for the case of a single particle moving in an external
electromagnetic field, that is, in a field which is supposed to be known
a priori and specified by the potentials $\phi$ and $\mathbf{A}$. The more general
problem of the motion of two or more particles, acting on each other
according to the laws of the classical electromagnetic theory, cannot be
solved with the help of a single equation involving the coordinates of
all the particles for the same instant of time, for according to this
theory the action emanating from each particle travels through space
with a finite velocity ($c$). The force acting on a particle (1) at the instant
$t$ depends upon the position and motion of the other particles (2, 3, ...)
at previous instants $t_{12} = t - R_{12}/c$, etc., $R_{12}$ being the distance between
the point where (1) is at the time $t$ and the point where (2) was at the
time $t_{12}$.
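
The retarded instant is defined only implicitly, since $R_{12}$ itself depends on $t_{12}$. A minimal sketch of how it can be found by successive approximations is given below; the trajectory, the units ($c = 1$), and the function names are assumptions made for illustration, and the iteration is only guaranteed to converge when particle (2) moves slower than $c$.

```python
import math

def retarded_time(t, r1, trajectory2, c=1.0, tol=1e-12, max_iter=200):
    """Solve t12 = t - |r1 - r2(t12)| / c by fixed-point iteration."""
    t12 = t
    for _ in range(max_iter):
        distance = math.dist(r1, trajectory2(t12))
        new_t12 = t - distance / c
        if abs(new_t12 - t12) < tol:
            return new_t12
        t12 = new_t12
    raise RuntimeError("retarded time did not converge")

def uniform_motion(t):
    """Hypothetical trajectory of particle (2): speed 0.5 c along the x-axis."""
    return (1.0 + 0.5 * t, 0.0, 0.0)

# Particle (1) sits at the origin at the instant t = 0.
t12 = retarded_time(0.0, (0.0, 0.0, 0.0), uniform_motion)
R12 = math.dist((0.0, 0.0, 0.0), uniform_motion(t12))
```

For this trajectory the implicit equation is linear and gives $t_{12} = -2/3$, $R_{12} = 2/3$, which the iteration reproduces.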

This fact, usually denoted as the law of retarded action, alone
precludes the possibility of treating the problem of motion and interaction
of a number of particles by means of a single equation of the
Hamilton-Jacobi type. We must, instead, write the relativistic equation of motion
for each individual particle assuming the electromagnetic potentials
produced by the other particles to be known, and furthermore a set of
equations defining the potentials produced by each particle, its motion
being supposedly known.

This problem allows, however, only an exact formulation. It cannot 
be solved exactly even for the simplest case of two particles. And 
there is no doubt that such a solution, if it could be obtained, would 
be in contradiction to the experimental facts. Assuming that the latter 
can be described adequately, so far as the motion of a particle in a 
given external field is concerned, by means of the relativistic wave 
mechanics, we must find a method of describing adequately the electromagnetic
field produced by a particle, whose motion is specified in
terms of wave mechanics, i.e. in terms of the probability theory. This 
means that together with the classical mechanics we must abandon the 
classical electrodynamics, based upon the idea of exactly specified motion, 
and replace it by a new ‘quantum electrodynamics’, not involving 
this idea. 

We shall consider this problem more closely later on (Chapter IX) 
and shall confine ourselves here to the more modest task of incorporating
into the wave-mechanical theory the magnetic forces, and
other effects connected with them, neglecting those which are due to the 
retarded character of the interaction between the electrified particles— 
electrons and protons—constituting matter. 

So far as the action of an external magnetic (or electromagnetic)
field on a system of such particles is concerned, the required generalization
of the previous theory presents no difficulties. We have merely
to replace in the expression of the energy operator the momentum
operators of the single particles $\mathbf{p}_k$ by the differences

$$\mathbf{p}_k - \frac{e_k}{c}\mathbf{A}_k,$$

where $\mathbf{A}_k = \mathbf{A}(x_k, y_k, z_k, t)$ is the vector potential of the external field
at the point where the particle in question is supposed to be situated at
the instant $t$ under consideration.

Putting further $U = \sum_k e_k\phi_k + U'$, where $\phi_k = \phi(x_k, y_k, z_k, t)$ is the
scalar potential of the external field at the point $(x_k, y_k, z_k)$, and
$U' = \sum\sum_{i<k} e_ie_k/r_{ik}$ is the mutual potential energy of the particles, we get

$$H = \sum_k\Bigl[\frac{1}{2m_k}\Bigl(\mathbf{p}_k - \frac{e_k}{c}\mathbf{A}_k\Bigr)^2 + e_k\phi_k\Bigr] + U'.$$
In the case (usually met with in practice) where the square of $\mathbf{A}$ can
be neglected as well as div $\mathbf{A}$, this expression reduces to the form

$$H = \sum_k\Bigl[\frac{1}{2m_k}p_k^2 - \frac{e_k}{m_kc}\mathbf{A}_k\cdot\mathbf{p}_k + e_k\phi_k\Bigr] + \sum\sum_{i<k}\frac{e_ie_k}{r_{ik}}. \qquad (319\,\mathrm{b})$$

We meet a much more difficult problem when we try to incorporate
in the energy operator terms representing the non-statical interaction
of the particles with each other. This problem can be solved approximately
if we neglect the retarded character of the electromagnetic
actions and define accordingly the vector-potential produced by a particle
with a charge $e_i$ and velocity $\mathbf{v}_i$ at a distance $r_{ik}$ by the expression
$e_i\mathbf{v}_i/(cr_{ik})$.

The total value of the vector potential $\mathbf{A}_k$ at a given point ($k$) is then
equal to the sum of the part $\mathbf{A}^0_k$ due to the external field and that
$\mathbf{A}'_k = \sum_{i\neq k} e_i\mathbf{v}_i/(cr_{ik})$ due to all the other particles. The total momentum of
the kth particle, $\mathbf{p}_k = m_k\mathbf{v}_k + (e_k/c)\mathbf{A}_k$, is thus given by the expression

$$\mathbf{p}_k = \sum_i g_{ki}\mathbf{v}_i + (e_k/c)\mathbf{A}^0_k, \qquad (320)$$

where $g_{ki} = m_k$ if $i = k$ and $e_ie_k/(c^2r_{ik})$ if $i \neq k$.

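The corresponding expression for the total kinetic energy $T$ of the
whole system [equal to the sum of the ordinary kinetic energy $\tfrac{1}{2}\sum_k m_kv_k^2$
and of the mutual kinetic energy $T' = \tfrac{1}{2}\sum_k (e_k/c)\,\mathbf{v}_k\cdot\mathbf{A}'_k$] is

$$T = \tfrac{1}{2}\sum_i\sum_k g_{ik}\,\mathbf{v}_i\cdot\mathbf{v}_k. \qquad (320\,\mathrm{a})$$

Putting $\mathbf{p}_k - (e_k/c)\mathbf{A}^0_k = \mathbf{p}'_k$ and solving the equations $\mathbf{p}'_k = \sum_i g_{ki}\mathbf{v}_i$
with respect to the $\mathbf{v}_i$'s, we get $\mathbf{v}_i = \sum_k g^{ik}\mathbf{p}'_k$, where $g^{ik} = \dfrac{1}{g}\dfrac{\partial g}{\partial g_{ik}}$,
$g$ being the determinant $|g_{ik}|$, and

$$T = \tfrac{1}{2}\sum_i\sum_k g^{ik}\,\mathbf{p}'_i\cdot\mathbf{p}'_k. \qquad (320\,\mathrm{b})$$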
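
As a worked illustration of (320 b), not given in the original, consider the case $n = 2$. Writing $\epsilon = e_1e_2/(c^2r_{12})$ as a shorthand introduced here, the coefficients $g_{ik}$ and their reciprocals are

```latex
\[
(g_{ik}) = \begin{pmatrix} m_1 & \epsilon \\ \epsilon & m_2 \end{pmatrix},
\qquad g = m_1m_2 - \epsilon^2,
\qquad
(g^{ik}) = \frac{1}{g}\begin{pmatrix} m_2 & -\epsilon \\ -\epsilon & m_1 \end{pmatrix},
\]
\[
T = \frac{m_2\,p_1'^2 + m_1\,p_2'^2 - 2\epsilon\,\mathbf{p}'_1\cdot\mathbf{p}'_2}{2(m_1m_2 - \epsilon^2)}
\;\approx\; \frac{p_1'^2}{2m_1} + \frac{p_2'^2}{2m_2}
- \frac{e_1e_2}{m_1m_2c^2\,r_{12}}\,\mathbf{p}'_1\cdot\mathbf{p}'_2 .
\]
```

The approximate form keeps only the first power of $\epsilon$, which should be adequate so long as $\epsilon$ is small compared with both masses; the last term is then the mutual kinetic energy expressed through the momenta.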

The classical Hamiltonian $H$ is equal to the sum of this expression and
the potential energy $U = \sum_k e_k\phi_k + U'$. The simplest way to obtain
the corresponding quantum Hamiltonian consists in replacing the $\mathbf{p}'_k$'s
in (319b) by the operators $(h/2\pi i)\nabla - (e/c)\mathbf{A}^0$. Since, however, these
operators do not commute with the coefficients $g^{ik}$ we might just as
well write $\mathbf{p}'_i g^{ik}\mathbf{p}'_k$ instead of $g^{ik}\mathbf{p}'_i\mathbf{p}'_k$ or, more generally, $f^{-1}\mathbf{p}'_i g^{ik}f\mathbf{p}'_k$,
where $f$ is any function of the coordinates. If (following L. Landau) we
put $f = \sqrt{g}$ we obtain for the quantum $T$ an operator which can be
considered as a generalization of the ordinary Laplacian in a curved
space with the line-element $ds^2 = \sum\sum g_{ik}\,dq_i\,dq_k$.

We shall now discuss some further complications of the theory of 
a system of electrons, namely, those connected with the spin effect. 

In the case of a single electron or proton this effect can be accounted
for approximately by introducing, in addition to the three space
coordinates of the particle $x$, $y$, $z$, a fourth 'spin coordinate' $\zeta$, able to
assume two values only. These values correspond, as we know, to two
opposite orientations of the spin parallel to a fixed direction, that of
the z-axis say, or, more exactly, to the two characteristic values of the
z-component of the spin matrix $\sigma_z$. We thus get, instead of a single
wave function $\psi(x, y, z)$ describing the motion of the particle in question,
a function doublet $\psi(x, y, z, \zeta)$ which can be dealt with as a linear
two-dimensional matrix with the elements $\psi_1(x, y, z) = \psi(x, y, z, 1)$ and
$\psi_2(x, y, z) = \psi(x, y, z, 2)$, 1 and 2 being the two values of $\zeta$. Instead of
these two values it is often more convenient to use $-\tfrac{1}{2}$ and $+\tfrac{1}{2}$, which
are equal to the respective values of the z-component of the spin angular
momentum expressed in the standard $h/2\pi$ unit.

The energy operator, as well as all the other operators referring to
the particle, must be defined accordingly as a square two-dimensional
matrix involving either the spin matrix $\boldsymbol\sigma$ or the unit matrix which is
equivalent to the square of any component of $\boldsymbol\sigma$.

These results can easily be generalized for a system of elementary
particles (electrons or protons) so long as their mutual action is neglected.
The wave function $\psi$ describing the behaviour of the whole system can
be defined as the product of the functions $\psi_k = \psi_k(x_k, y_k, z_k, \zeta_k)$ referring
to the individual particles ($k = 1, 2, \ldots, n$). The expression $\psi^*\psi$
multiplied by the volume-element of the configuration space $dV = dV_1 \ldots dV_n
= dx_1\,dy_1\,dz_1 \ldots dx_n\,dy_n\,dz_n$ is to be regarded as a measure of the
probability of finding the system in the corresponding configuration with
the specified values of the spin coordinates. The number of such
specified values is obviously equal to $2^n$, so that there are $2^n$ states
corresponding to each configuration and differing from each other by
the orientation of the separate particles inasmuch as this orientation is
specified by the characteristic values of $\sigma_z$. The total probability of a
given configuration, irrespective of the orientation of the particles, is
measured by the sum

$$\sum_{\zeta}\psi^*\psi = \sum_{\zeta_1}\sum_{\zeta_2}\cdots\sum_{\zeta_n}\psi^*\psi$$

extended over the two possible values of each of the spin coordinates.
In the case of a motion belonging to a discrete spectrum this sum must
be normalized according to the condition

$$\int\sum_{\zeta}\psi^*\psi\,dV = 1,$$

the integration being extended over the whole configuration space.
With regard to the definition of $\psi$, we have

$$\int\sum_{\zeta}\psi^*\psi\,dV = \int\sum_{\zeta_1}\psi_1^*\psi_1\,dV_1 \cdots \int\sum_{\zeta_n}\psi_n^*\psi_n\,dV_n,$$

$$\int\sum_{\zeta_k}\psi_k^*\psi_k\,dV_k = \int(\psi_{k1}^*\psi_{k1} + \psi_{k2}^*\psi_{k2})\,dV_k = 1,$$

where

$$\psi_{k\zeta_k}(x_k, y_k, z_k) = \psi_k(x_k, y_k, z_k, \zeta_k) \quad (\zeta_k = 1, 2).$$

The product $\psi$ considered as a function of the space coordinates alone
can be dealt with as a linear matrix of $2^n$ dimensions

$$\psi_\zeta = \psi_{1\zeta_1}\psi_{2\zeta_2}\cdots\psi_{n\zeta_n}.$$

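This involves the use of operators which should be defined as square
matrices of the same rank. Such an operator, $F$ say, can be defined
by the equation

$$(F\psi)_{\zeta'} = \sum_{\zeta''}F_{\zeta'\zeta''}\psi_{\zeta''},$$

where $F_{\zeta'\zeta''}$ is an operator of the ordinary kind with respect to the
space coordinates $x_1, \ldots, z_n$ and the corresponding momenta, specified by
two sets of particular values of the spin coordinates, $\zeta' = (\zeta'_1, \zeta'_2, \ldots, \zeta'_n)$
and $\zeta'' = (\zeta''_1, \zeta''_2, \ldots, \zeta''_n)$. Each of the individual wave functions $\psi_k$
satisfies the matrix-operator equation

$$(H_k + \delta_kp_t)\psi_k = 0,$$

where $\delta_k$ is the two-dimensional unit matrix referring to the kth particle
(with the elements $\delta_{\zeta'_k\zeta''_k}$). The factorized wave function $\psi$ is easily seen
to satisfy an equation of the same type,

$$(H + \delta p_t)\psi = 0, \qquad (321)$$

where $\delta$ is the $2^n$-dimensional unit matrix with the elements

$$\delta_{\zeta'\zeta''} = \delta_{\zeta'_1\zeta''_1}\delta_{\zeta'_2\zeta''_2}\cdots\delta_{\zeta'_n\zeta''_n}, \qquad (321\,\mathrm{a})$$

and $H$ the energy operator defined by the formula

$$(H\psi)_{\zeta'} = \sum_{\zeta''}H_{\zeta'\zeta''}\psi_{\zeta''}, \qquad
H_{\zeta'\zeta''} = \sum_k\delta_{\zeta'_1\zeta''_1}\cdots(H_k)_{\zeta'_k\zeta''_k}\cdots\delta_{\zeta'_n\zeta''_n}, \qquad (321\,\mathrm{b})$$

$(H_k)_{\zeta'_k\zeta''_k}$ being the elements of the ordinary two-dimensional matrix
operator referring to the kth particle.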
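
The structure of (321 a) and (321 b) is that of a product of unit matrices with a single factor replaced by $H_k$. The sketch below builds the corresponding $2^n$-rowed matrix for $n = 3$ under a simplifying assumption: the space dependence is ignored and each $H_k$ is taken as a diagonal two-rowed matrix with hypothetical level splittings $\pm b_k$. The function names are illustrative.

```python
from functools import reduce

def kron(a, b):
    """Kronecker product of two square matrices given as nested lists."""
    return [[a[i][j] * b[k][l] for j in range(len(a)) for l in range(len(b))]
            for i in range(len(a)) for k in range(len(b))]

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def total_hamiltonian(single):
    """Matrix of (321 b): sum over k of 1 x ... x H_k x ... x 1."""
    n = len(single)
    dim = 2 ** n
    total = [[0] * dim for _ in range(dim)]
    for k, h_k in enumerate(single):
        # Unit matrices for every particle except the kth.
        factors = [h_k if i == k else identity(2) for i in range(n)]
        term = reduce(kron, factors)
        total = [[x + y for x, y in zip(rt, rk)] for rt, rk in zip(total, term)]
    return total

b = [1.0, 2.0, 4.0]                              # hypothetical splittings
single = [[[bk, 0], [0, -bk]] for bk in b]       # H_k reduced to its spin part
H = total_hamiltonian(single)
unit = reduce(kron, [identity(2)] * 3)           # the matrix (321 a)
diagonal = [H[i][i] for i in range(8)]
```

Each of the $2^3 = 8$ diagonal elements is the sum of one characteristic value from each particle, as expected for a factorized wave function.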

Equation (321) can naturally be extended to functions $\psi$ of a more
general type, equal, for instance, to a sum of particular solutions of the
simple product type. It can be further generalized in order to account
for the mutual action of the particles by adding to it terms representing
the interaction energy multiplied by the unit matrix (321 a). (In problems
of the atomic theory involving only a small number of electrons
the mutual kinetic energy $T'$ can be neglected.) There remains, however,
still one step in this generalization, which consists in the addition
to the interaction energy of terms characteristic of the spin effect. We
can solve this problem in a tentative way with the help of the approximate
theory of § 30. We found there that the additional 'spin' force
acting on a particle (electron) in a given electromagnetic field $\mathbf{E}$, $\mathbf{H}$ can
be derived from the energy operator


$$-\mu\Bigl[\boldsymbol\sigma\cdot\mathbf{H} + \frac{1}{2m_0c}\,\mathbf{E}\cdot(\mathbf{p}\times\boldsymbol\sigma)\Bigr]$$


[cf. equation (261 a), where u is replaced by p]. It is natural to suppose 
that this result will still be valid for a system of particles, if H and E 
are defined as the total field acting on the given particle due both to 
external causes and to other particles constituting the system. The 
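field $\mathbf{E}$, $\mathbf{H}$ produced by a certain particle at a distance $r$ can be derived
in the usual way from the potentials $\phi$, $\mathbf{A}$ defined by the following
formulae:

$$\phi = \frac{e}{r}, \qquad \mathbf{A} = \frac{e}{m_0cr}\,\mathbf{p} + \frac{\mu}{r^3}\,\boldsymbol\sigma\times\mathbf{r}.$$
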


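The first term in $\mathbf{A}$ represents the ordinary electromagnetic field of
a moving point-charge, while the second is introduced as an equivalent
for the field produced by an elementary magnet with a moment $\mu\boldsymbol\sigma$.
Neglecting the electric field due to the variation of $\mathbf{A}$ with the time,
i.e. putting

$$\mathbf{E} = -\nabla\phi = \frac{e}{r^3}\,\mathbf{r}$$

and

$$\mathbf{H} = \operatorname{curl}\mathbf{A} = \frac{e}{m_0cr^3}\,\mathbf{p}\times\mathbf{r} + \mu\Bigl[\frac{3(\boldsymbol\sigma\cdot\mathbf{r})\,\mathbf{r}}{r^5} - \frac{\boldsymbol\sigma}{r^3}\Bigr],$$
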
we get for the operator of the spin interaction energy the following
expression: $U_s = U'_s + U''_s$, (322)

when, <m.) 


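and
$$U''_s = -\sum\sum_{k<i}\mu_k\mu_i\Bigl[\frac{3(\boldsymbol\sigma_k\cdot\mathbf{r}_{ki})(\boldsymbol\sigma_i\cdot\mathbf{r}_{ki})}{r_{ki}^5} - \frac{\boldsymbol\sigma_k\cdot\boldsymbol\sigma_i}{r_{ki}^3}\Bigr]. \qquad (322\,\mathrm{b})$$
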

In deriving the term $U'_s$ which represents the linear or electromagnetic
part of the spin interaction energy we have simply summed up the
contributions of all the particles concerned ($\mathbf{r}_{ki}$ denoting the radius
vector from the ith particle to the kth), whereas in deriving the
quadratic or purely magnetic part of the energy $U''_s$, which is
symmetrical with regard to each pair of particles, we have taken each pair
only once (as indicated by the condition $k < i$).

It should be noticed further that in adding $U_s$ to the Hamiltonian
$H$ in (321) we must multiply it by the unit matrix (321 a). This amounts
to the multiplication of each term by those two-dimensional unit
matrices only which refer to other particles than those represented by
the matrices $\boldsymbol\sigma$. Dropping these unit matrices and neglecting the
mutual kinetic energy we can write the total Hamiltonian in the form 
$$H = \sum_kH_k + U' + U_s, \qquad (323)$$

where $U_s$ is defined by (322), while

$$H_k = \frac{1}{2m_k}p_k^2 - \frac{e_k}{m_kc}\mathbf{A}_k\cdot\mathbf{p}_k + e_k\phi_k - \mu_k\Bigl[\boldsymbol\sigma_k\cdot\mathbf{H}_k + \frac{1}{2m_kc}\,\mathbf{E}_k\cdot(\mathbf{p}_k\times\boldsymbol\sigma_k)\Bigr]; \qquad (323\,\mathrm{a})$$

$H_k$ is the energy operator for the kth particle, $\mathbf{A}_k$, $\phi_k$, $\mathbf{H}_k$, and $\mathbf{E}_k$ denoting
the potentials and intensities of the external electromagnetic field at
the point $(x_k, y_k, z_k)$. If this field does not depend upon the time, the
equation $(H + \delta p_t)\psi = 0$ admits solutions of the type $\psi = \psi^0_{H'}\,e^{-i2\pi H't/h}$
corresponding to a motion of the system with a fixed energy $H'$, the
function $\psi^0_{H'}$ being defined by $(H - H')\psi^0_{H'} = 0$. To each state or
energy-level defined by the approximate equation to which it reduces if the
spin effect of all the particles is neglected, there correspond, in general,
$2^n$ different states with slightly different energy-levels, which form what
is called a 'spin multiplet'. The theory of such multiplets for the
simplest case of a single particle has been discussed in the preceding 
chapter. The general results stated there (§ 29) about the orthogonality
properties of the functions $\psi$, the matrix and supermatrix representation
of various physical quantities, the perturbation theory, and so on
can easily be extended to the case $n > 1$. We shall not discuss these
questions here, but shall leave some of them for a later section where 
they will be considered in connexion with Pauli’s exclusion principle for 
identical elementary particles (electrons or protons). 

The method which has been applied above for the description of the 
spin effect characteristic of such particles can be used in a somewhat 
generalized form for the description of the orientation or inner states 
of complicated particles (such as atomic nuclei or whole atoms, etc.)
so long as they are treated as moving material points.

Let us consider, for example, a particle possessing an inner angular
momentum (which may be due both to orbital and spin motion of the
electrons and protons constituting it) of $s$ units. Such a particle can
assume $2s+1$ quantized orientations, corresponding to the values

$$m = -s,\ -(s-1),\ -(s-2),\ \ldots,\ +(s-1),\ +s \qquad (324)$$

of the z-component of $\mathbf{s}$. These numbers can be defined as the
characteristic values of a matrix $\sigma_z$, which is the z-component of a matrix $\boldsymbol\sigma$
representing the inner angular momentum of the particle in question
(in units of $h/2\pi$). The matrix elements of $\sigma_x$, $\sigma_y$, and $\sigma_z$ are defined

by the equations 

K+^„Wm = Vfo+lP-Oa + iJV"* ) 

K-^)m, m+ l = • ( 324a ) 

KL = m ) 

which are obtained from the equations (94 b), (96), and (96 a) of § 13
(Chap. III), if $M$ is replaced by $h\boldsymbol{\sigma}/2\pi$ and $l$ by $s$. The motion of such
a particle in a given external field of force can be described in exactly
the same way as was done above for the particular case $s = \tfrac12$,
namely, by introducing, in addition to the ‘external’ coordinates $x$, $y$, $z$
defining the position of the particle’s centre of gravity, an ‘inner’
angular momentum coordinate $\zeta$, which should assume the values
$1, 2, \ldots, 2s+1$, corresponding to the characteristic values (324) of $\sigma_z$. If,
moreover, the additional energy of the particle in a magnetic field $\mathbf{H}$ is
represented by the operator $\mu\,\mathbf{H}\cdot\boldsymbol{\sigma}$, we get a direct generalization of the
Pauli theory of the spin effect, discussed in § 29. A similar generalization
is obtained if we consider a system of particles (such as electrons
and atomic nuclei) which differ from each other not only with respect
to the charge and mass, but also with respect to the inner momentum
number $s$ or the multiplicity $2s+1$. A problem of this sort is met with,
for instance, in connexion with the hyperfine structure of atomic
spectra, due to the fact that the nuclei of many atoms actually possess
an inner angular momentum and a very small magnetic moment associated
with it. The magnetic field produced by the latter can be
specified by a vector potential of the same form,

$$ \mathbf{A} = \frac{\mu}{r^3}\,\boldsymbol{\sigma} \times \mathbf{r}, $$

as for an electron (or proton), giving rise to an interaction energy of
the type (322 a, b), with $\sigma_k$ denoting matrices of various ranks (2 for
an electron; 1, 2, 3, etc., for a nucleus). 
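
The matrices defined by (324) and (324 a) can be written out explicitly for any value of $s$. The following is a minimal sketch in plain Python, not taken from the original text: the names `spin_matrices`, `matmul` and `total_square` are illustrative, and the square-root form of the off-diagonal elements is the standard angular-momentum result assumed here for (324 a). It checks that $\sigma_x^2 + \sigma_y^2 + \sigma_z^2$ is $s(s+1)$ times the unit matrix.

```python
import math
from fractions import Fraction


def spin_matrices(two_s):
    """Return (sx, sy, sz) as nested lists for s = two_s / 2, in units of h/2pi."""
    dim = two_s + 1
    s = Fraction(two_s, 2)
    ms = [-s + k for k in range(dim)]          # m = -s, ..., +s, as in (324)
    # (sx + i sy) has the single non-zero element (m+1, m) in each column.
    raising = [[0.0] * dim for _ in range(dim)]
    for k, m in enumerate(ms[:-1]):
        raising[k + 1][k] = math.sqrt(s * (s + 1) - m * (m + 1))
    # (sx - i sy) is the transpose of the (real) raising matrix.
    lowering = [[raising[c][r] for c in range(dim)] for r in range(dim)]
    sx = [[(raising[r][c] + lowering[r][c]) / 2 for c in range(dim)]
          for r in range(dim)]
    sy = [[(raising[r][c] - lowering[r][c]) / 2j for c in range(dim)]
          for r in range(dim)]
    sz = [[float(ms[r]) if r == c else 0.0 for c in range(dim)]
          for r in range(dim)]
    return sx, sy, sz


def matmul(a, b):
    """Plain matrix product of two square nested lists."""
    n = len(a)
    return [[sum(a[r][k] * b[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]


def total_square(two_s):
    """Return sx^2 + sy^2 + sz^2 for s = two_s / 2."""
    sx, sy, sz = spin_matrices(two_s)
    n = two_s + 1
    xx, yy, zz = matmul(sx, sx), matmul(sy, sy), matmul(sz, sz)
    return [[xx[r][c] + yy[r][c] + zz[r][c] for c in range(n)]
            for r in range(n)]
```

Calling `total_square(1)` corresponds to an electron ($s = \tfrac12$, rank 2), and `total_square(2)` to a particle with $s = 1$ (rank 3).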

These considerations show, by the way, that an electron can be 
visualized not as a point but as a spinning sphere, according to the 
classical model, in spite of the fact that in the Pauli or Dirac theory 
it is treated as a point. 


39. Complex Particles treated as Material Points with Inner 
Coordinates; Theory of Incomplete Systems 


Complex particles can be treated as elementary, i.e., material points if 
inner coordinates and momenta are introduced to specify their orienta¬ 
tion, the total value of the inner angular momentum, if it is variable, 
as well as other quantities, serving to describe their inner properties. 

Let us denote by $x$ $(x, y, z)$ the coordinates of the centre of gravity
of the particle, the coordinates specifying the relative motion of the
elementary particles (electrons, protons) constituting it being denoted
by $q$ $(q_1, q_2, \ldots)$. Let us divide further the energy operator $H$ into three
parts, $K$, $L$, $M$, where $K$ is a function of the $x$'s (and of the associated
momenta represented by the operators $\frac{h}{2\pi i}\frac{\partial}{\partial x}$), $L$ a function of the $q$'s
(and of the associated momenta $\frac{h}{2\pi i}\frac{\partial}{\partial q}$ as well as of the spin variables),
while $M$ is a function of both. We shall assume them all to be independent
of the time and shall denote the characteristic values and
functions of $L$ by $L'$ and $\chi_{L'}(q)$ respectively.

The solution of the equation $(H - H')\psi = 0$ for a stationary state
of the complex particle (supposed to move in a given external field of
force) can be represented in the form

$$ \psi_{H'} = \sum_{L'} \phi_{L'}(x)\,\chi_{L'}(q), \tag{325} $$

where $\phi_{L'}(x)$ are certain expansion coefficients with regard to the
variables $q$, being themselves functions of the variables $x$. These
functions can be determined by substituting (325) in the equation
$(H - H')\psi = 0$, which gives

$$ \sum_{L'} \bigl\{ (K\phi_{L'})\,\chi_{L'} + (L'\chi_{L'} + M\chi_{L'})\,\phi_{L'} \bigr\} = H' \sum_{L'} \phi_{L'}\,\chi_{L'}. $$

Now the operator $M$ applied to the function $\chi_{L'}$ and thereafter to the
function $\phi_{L'}$ gives the same result as the operator

$$ \sum_{L''} M_{L''L'}\,\chi_{L''} $$

acting directly on $\phi_{L'}$, where

$$ M_{L''L'} = \int \chi^*_{L''}\,M\,\chi_{L'}\,dq $$

are the matrix elements of $M$ with regard to the characteristic inner
states of our complex particle (these matrix elements are functions of the
$x$ and, in general, of the associated operators $\frac{h}{2\pi i}\frac{\partial}{\partial x}$). We thus get

$$ \sum_{L'} \chi_{L'} \bigl[ K\phi_{L'} + (L' - H')\phi_{L'} \bigr] + \sum_{L'} \sum_{L''} \chi_{L''}\,M_{L''L'}\,\phi_{L'} = 0, $$

or, interchanging the summation indices in the double sum and equating
to zero the coefficients of the functions $\chi_{L'}$,

$$ K\phi_{L'} + \sum_{L''} M_{L'L''}\,\phi_{L''} = (H' - L')\,\phi_{L'}. \tag{325 a} $$
V 

This system of equations can be written in the form of a single
‘operator-matrix’ equation

$$ J\phi = J'\phi, \tag{325 b} $$

if $\phi$ is defined as a one-column matrix with the elements $\phi_{L'}$ and $J$ as
a square matrix operator with the elements

$$ J_{L'L''} = K\,\delta_{L'L''} + M_{L'L''}, \tag{325 c} $$

$\delta_{L'L''}$ denoting the unit matrix and $J\phi$ the one-column matrix resulting
from the matrix multiplication of $J$ by $\phi$; $J' = H' - L'$ are the characteristic
values of $J$. We can also regard $\phi$ as a vector and $J$ as a tensor
in the state-space, corresponding to the inner motion (and orientation)
of the particle under consideration, and specified by the quantum
numbers $L'$ (which must include besides the energy other constants
of the inner motion). We can finally regard $L'$ as a sort of ‘inner’
coordinate (or coordinates) of the particle so long as it is treated as
a material point, in the same sense as this is done in Pauli’s or Dirac’s
theory of the spinning electron, with the only difference that the number
of possible values of $L'$ is in general infinite, instead of being equal to
2 (as in Pauli’s theory) or to 4 (as in that of Dirac). The ‘inner’ quantum
numbers corresponding to these additional coordinates in the functions
$\phi(x, L')$, compared with the functions $\phi_{K'}(x)$ which are the solutions of
the ‘unperturbed’ equation $(K - K')\phi_{K'}(x) = 0$, can be represented by
the values of the difference $J' - K'$ for the same value of $K'$.


The different solutions of the equation

$$ J\phi_{J'} = J'\phi_{J'}, $$

i.e. solutions referring to different values of $J'$, if quadratically
integrable, are orthogonal to each other and can be normalized according
to the equation

$$ \int \phi^{\dagger}_{J'}\,\phi_{J''}\,dx = \delta_{J'J''}, \tag{326} $$

where $\phi^{\dagger}$ is the one-row matrix formed by the elements which are the
conjugate complex of those constituting the one-column matrix $\phi$.
Introducing $L'$ as an inner coordinate, we can rewrite the preceding
equation in the form

$$ \int \sum_{L'} \phi^*_{J'}(x, L')\,\phi_{J''}(x, L')\,dx = \delta_{J'J''}. \tag{326 a} $$

This result easily follows from the self-adjoint character of the operator
matrix $J$, which in its turn is a consequence of the self-adjoint character
of the complete Hamiltonian $H$.

All quantities referring to the translational motion of the particle
under consideration must be represented by operator-matrices of the
type

$$ F_{L'L''}\!\left(x, \frac{h}{2\pi i}\frac{\partial}{\partial x}\right) = F\!\left(x, \frac{h}{2\pi i}\frac{\partial}{\partial x};\ L', L''\right), $$

the inner coordinates appearing twice: in the role of ordinary coordinates,
and in that of the momenta. The matrix element of such
a quantity with regard to two states of motion, specified by the functions
$\phi_{J'}$ and $\phi_{J''}$, is given by the expression

$$ F_{J'J''} = \int \phi^{\dagger}_{J'}\,F\,\phi_{J''}\,dx = \int dx \sum_{L'} \sum_{L''} \phi^*_{J'}(x, L')\,F(L', L'')\,\phi_{J''}(x, L''). \tag{326 b} $$


This expression is a generalization of those appearing in the theory of 
Pauli and Dirac, with the inner (‘spin’) coordinate assuming two or 
four values only. 

Let us suppose, for example, that the particle is an ion (charge $e$,
mass $m$) moving in an electrostatic field, which within the particle can
be dealt with as practically homogeneous and equal to $\mathbf{E} = -\nabla V(x, y, z)$,
where $V(x, y, z)$ is the electric potential at the point (centre of gravity)
representing the particle. We then have, by the ordinary Schrödinger
theory,

$$ K = -\frac{h^2}{8\pi^2 m}\nabla^2 + eV(x, y, z), $$

as for an elementary particle with a charge $e$ and a mass $m$, and further

$$ M = -\mathbf{E}\cdot\mathbf{P}, $$

where $\mathbf{P}$ is the resulting electric moment of the particle, the position
of the electrons and protons being referred to the point $(x, y, z)$. The
operator $L$ which specifies the inner motion of the particle, in the
absence of the external electric field, need not be considered here. All
we need to know are the matrix elements of $\mathbf{P}$ with regard to the
stationary states representing this inner motion, the translational
motion being determined by an equation of the type (325 a) with

$$ M_{L'L''} = -\mathbf{E}(x)\cdot\mathbf{P}_{L'L''}. $$

For a particle moving in an inhomogeneous magnetic field (a problem
met with, for example, in the Stern-Gerlach experiments), we get in
a similar manner

$$ M_{L'L''} = -\mathbf{H}(x)\cdot\boldsymbol{\mu}_{L'L''}, $$

$\boldsymbol{\mu}_{L'L''}$ being the matrix elements of the resulting magnetic moment of
the particle.

The preceding theory can be easily extended to the general case of
a system of complex particles, considered as material points, or to the
still more general case of any ‘incomplete’ system $A$, which is a part
of a complete system $AB$, specified by the Hamiltonian $H$. If the part
of $H$ corresponding to $A$ taken alone is denoted by $K$, that corresponding
to $B$ by $L$, and the rest, representing the mutual action or
‘coupling’ between $A$ and $B$, by $M$, we obtain for the motion of $A$ the
same results as before, the coordinates $x$ specifying in the general case
the configuration of $A$, and $\phi(x, L')$ being the probability amplitude of
this configuration for a given stationary state $L'$ of $B$.

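In the case of two particles, for example, we have, denoting by $x_1$, $x_2$
the coordinates of the respective centres of gravity and by $q_1$, $q_2$ the inner
coordinates,

$$ \chi_{L'}(q) = \chi_{L_1'}(q_1)\,\chi_{L_2'}(q_2), \qquad L' = L_1' + L_2', \tag{327} $$

since the operator of the inner motion (without interaction) $L$ obviously
reduces to the sum of the corresponding operators $L_1$ and $L_2$ for each
of the two particles taken separately. Putting further

$$ \phi_{L'}(x) = \phi_{L_1'L_2'}(x_1, x_2), \tag{327 a} $$

we obtain for $\phi$ an equation of the same type as before. If the two
particles are treated with regard to their mutual action as electrical
dipoles, their mutual potential energy will be represented by the
operator

$$ M = \frac{\mathbf{P}_1\cdot\mathbf{P}_2}{r^3} - \frac{3(\mathbf{r}\cdot\mathbf{P}_1)(\mathbf{r}\cdot\mathbf{P}_2)}{r^5}, $$

where $\mathbf{r}$ is the radius vector drawn from one particle to the other (with
the components $x_1 - x_2$, etc.), whence

$$ M_{L'L''} = \frac{1}{r^3}\,\mathbf{P}_{1\,L_1'L_1''}\cdot\mathbf{P}_{2\,L_2'L_2''} - \frac{3}{r^5}\,(\mathbf{r}\cdot\mathbf{P}_{1\,L_1'L_1''})(\mathbf{r}\cdot\mathbf{P}_{2\,L_2'L_2''}). \tag{327 b} $$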


It should be noted that in spite of the incompleteness of the system
$A$, specified by the energy operator $K + M$ which represents its own
energy and the action on it produced by the ‘ignored’ part $B$, the motion
of $A$ is exactly determined if the operator $M$ is defined as a matrix
with regard to the stationary states of $B$. This method of describing the
motion of an incomplete system $A$ is especially convenient if its coupling
with $B$ is relatively weak and if for some reason we are not concerned
with the details of the motion of $B$. As a further example of a (rather
unconscious) application of this method we shall mention Fermi’s theory
of the hyperfine structure of spectra, due to the mutual action of an
electron ($A$) with a nucleus ($B$) possessing a magnetic moment. The
motion of the electron is determined in this theory with the help of
Dirac’s equation, the action of the nuclear magnetic moment on the
electron being represented by the vector potential $\mathbf{A} = \frac{\mu}{r^3}\,\boldsymbol{\sigma} \times \mathbf{r}$, where
$\boldsymbol{\sigma}$ is the well-known matrix of rank $2s+1$, specifying the angular
momentum of the nucleus $h\mathbf{s}/2\pi$. The wave function $\psi$ must be treated
accordingly as a rectangular matrix with four columns (corresponding
to the four components of the Dirac wave function) and $2s+1$ rows.
We shall discuss later another interesting application of the same
method (due to Heisenberg) to the problem of the interaction between
matter ($A$) and radiation ($B$), the latter being described by ordinary
electromagnetic oscillations, whose amplitudes are treated as matrices
(Chap. IX).

If the interaction energy $M$ is relatively small so that the second
term on the left side of the equation,

$$ K\phi(x, L') + \sum_{L''} M(L', L'')\,\phi(x, L'') = (H' - L')\,\phi(x, L'), $$

can be treated as a small perturbation, this equation can be solved
approximately with the help of the ordinary perturbation method
starting with the solution of the equation which is obtained by dropping
the term $M$. More exactly, since our problem becomes degenerate, we
must consider the whole set of solutions corresponding to the same
unperturbed energy-level $H' - L' = K'$. Writing $(K', \bar{L}')$ for $J'$, where
$\bar{L}'$ denotes an inner quantum number independent of $L'$ but identical
with it in regard to the range of its possible values,† we can define an
orthogonal and normal set of solutions of the unperturbed equation
$K\phi = K'\phi$ by the formula

$$ \phi^0_{K'\bar{L}'}(x, L') = \omega_{K'}(x)\,\delta_{\bar{L}'L'}, \tag{328} $$

where $\omega_{K'}(x)$ denotes the solution of the above equation leaving out of
account the inner coordinates, while $\delta_{\bar{L}'L'}$ are the elements of the unit
matrix. The function $\omega_{K'}(x)$ is supposed to be normalized according to
the ordinary condition $\int |\omega_{K'}(x)|^2\,dx = 1$; it is supposed, moreover, to
be the only solution of the ordinary Schrödinger equation $K\omega = K'\omega$
corresponding to the energy-level $K'$ (so that no further degeneracy
outside of that which is specified by the quantum numbers $\bar{L}'$ need be
considered).

† In the same sense as the spin coordinate $\zeta = 1, 2$ and the spin quantum number
$\lambda = 1, 2$ for a single electron of the Pauli theory (cf. § 29).

The approximate solutions of the exact equation, ‘stabilized’ for the
perturbation $M$, can be defined, according to the general theory, as
linear combinations of the functions (328),

$$ \phi_{J'}(x, L') = \sum_{\bar{L}'} c_{\bar{L}'}\,\phi^0_{K'\bar{L}'}(x, L'). \tag{328 a} $$

The sum reduces in the present case to a single term, so that we get

$$ \phi_{J'}(x, L') = c_{L'}\,\omega_{K'}(x). \tag{328 b} $$

If $M$ were an ordinary operator not involving the inner coordinates,
then the coefficients of the transformation (329) for each admissible
value of the perturbation energy $H' - L' - K' = \Delta K'$ (together with
the latter) would be determined by the system of equations

$$ \sum_{\bar{L}''} M_{\bar{L}'\bar{L}''}\,c_{\bar{L}''} = \Delta K'\,c_{\bar{L}'}, $$

where $M_{\bar{L}'\bar{L}''}$ are the matrix elements of $M$ with respect to the unperturbed
functions. These equations remain valid in the present case
provided the matrix elements of $M$ are defined according to the general
formula (326 b), which gives, in virtue of (328 a),

$$ M_{\bar{L}'\bar{L}''} = \int \omega^*_{K'}(x)\,M(\bar{L}', \bar{L}'')\,\omega_{K'}(x)\,dx. $$

Denoting this expression by $M_{K'L';\,K'L''}$ and dropping the bars over the
$L$'s, we get

$$ \sum_{L''} M_{K'L';\,K'L''}\,c_{L''} = \Delta K'\,c_{L'}. \tag{329} $$


We shall not stop here to discuss these equations, since they are 
practically identical with those of the ordinary perturbation theory. 
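
The secular equations (329) can be illustrated on a deliberately small example. The sketch below is not from the original text and rests on several assumptions chosen only for illustration: the translational motion is one-dimensional, the unperturbed function `omega` is a normalized Gaussian, the field strength `field(x)` is $1 + x^2$, and the inner system has just two states coupled by a hypothetical moment matrix `P`. The perturbation matrix is obtained by averaging $-E(x)\,P_{L'L''}$ over $|\omega|^2$, and its two characteristic values play the part of $\Delta K'$.

```python
import math


def trapezoid(f, a, b, n):
    """Composite trapezoidal rule for the integral of f over [a, b]."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for k in range(1, n):
        total += f(a + k * h)
    return total * h


def omega(x):
    """Hypothetical unperturbed function, normalized so that int omega^2 dx = 1."""
    return math.pi ** -0.25 * math.exp(-0.5 * x * x)


def field(x):
    """Hypothetical field strength E(x) along the single coordinate x."""
    return 1.0 + x * x


# Hypothetical matrix of the moment P between two inner states L', L''.
P = [[0.0, 0.2],
     [0.2, 0.0]]


def perturbation_matrix():
    """Matrix int omega* M(L', L'') omega dx with M(L', L'') = -E(x) P(L', L'')."""
    avg_field = trapezoid(lambda x: omega(x) ** 2 * field(x), -8.0, 8.0, 4000)
    return [[-avg_field * P[r][c] for c in range(2)] for r in range(2)]


def eigenvalues_2x2(m):
    """Characteristic values of a real symmetric 2x2 matrix, in ascending order."""
    a, b, d = m[0][0], m[0][1], m[1][1]
    mean = 0.5 * (a + d)
    radius = math.sqrt((0.5 * (a - d)) ** 2 + b * b)
    return mean - radius, mean + radius
```

With these assumed inputs the averaged field is $1 + \tfrac12 = 1.5$, so `eigenvalues_2x2(perturbation_matrix())` gives two level shifts close to $-0.3$ and $+0.3$: the degenerate level $K'$ splits symmetrically.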

It should be added in conclusion that the preceding theory can easily
be generalized for non-stationary phenomena corresponding to an explicit
dependence of the energy operator $H$ upon the time. So long
as this dependence does not affect the operator $L$, it is sufficient to
replace the characteristic value $H'$ of $H$ in (325 a) by the operator
$-p_t = -\frac{h}{2\pi i}\frac{\partial}{\partial t}$, the functions $\phi_{L'}$ being determined by the equation

$$ K\phi_{L'} + \sum_{L''} M_{L'L''}\,\phi_{L''} + L'\phi_{L'} + \frac{h}{2\pi i}\frac{\partial \phi_{L'}}{\partial t} = 0. \tag{329 a} $$


40. Identical Particles (Electrons) and the Exclusion Principle

Returning to elementary particles, we shall now take into account the
restrictive condition which follows from the identity of all the electrons
or all the protons and which is expressed by Pauli’s exclusion principle
or by the Dirac antisymmetry principle for the wave functions $\psi$
describing the behaviour of a system of electrons or protons (see § 22,
Part I). For the sake of simplicity we shall apply this principle to
a system of electrons only, treating protons and atomic nuclei as fixed
centres of force. Such a treatment can actually be applied with sufficient
accuracy to many problems connected with the structure of atoms,
molecules, and material bodies; for in view of the relatively large mass
of the atomic nuclei (protons included) they can be dealt with to a
certain approximation as fixed material points, producing the external
electrostatic (and also magnetostatic) field in which the electrons are
supposed to move.

We must, to begin with, check the validity of the Pauli principle in
Dirac’s form (in the sense of its permanence in time) from the point
of view of the generalized equation of motion, involving the spin
coordinates, which has been established in the preceding chapter.†

This equation can be written in the following form:

$$ \sum_{\zeta'} H(x_1, \zeta_1, p_1, \zeta_1';\ \ldots;\ x_n, \zeta_n, p_n, \zeta_n')\,\psi(x_1, \zeta_1', \ldots, x_n, \zeta_n', t) = -\frac{h}{2\pi i}\frac{\partial}{\partial t}\psi(x_1, \zeta_1, \ldots, x_n, \zeta_n, t), \tag{330} $$

i.e. as a system of $2^n$ equations for the set of $2^n$ wave functions $\psi$,
where $x_k$ and $p_k$ stand short for the coordinate triplets $x_k$, $y_k$, $z_k$ and the
momentum components $p_{kx}$, $p_{ky}$, $p_{kz}$. The space coordinates of each
particle, together with its spin coordinate, form a coordinate quadruplet;
the same is true of the momenta, the momentum corresponding
to the spin coordinate being replaced by a duplication of the latter,
which gives to $H$ its operator-matrix character.

In view of the identity of all the electrons, $H$ must be a symmetrical
function with regard to the indices $1, 2, \ldots$ distinguishing them. If,
therefore, the wave function is symmetrical or antisymmetrical with
regard to these indices (i.e. with regard to all the coordinate quadruplets)
at some instant $t$, its derivative $\partial\psi/\partial t$, and consequently its
value for the next (or preceding) instant, will be so too. The symmetrical
or antisymmetrical character of $\psi$ can be regarded therefore as
a permanent property. The fact that for a system of electrons (or
protons) antisymmetrical wave functions only must be used to interpret
the experimental data has been discussed at length in § 22 of
Part I.

† It should be remembered that the permanence of the antisymmetrical character
of the wave function has been established in Part I on the basis of the ordinary
Schrödinger equation for a system of identical particles without spin.

As the spin forces are very small compared with the electrostatic 
ones, a fairly good approximation (of ‘zero order’) can be obtained by 
totally neglecting them (as well as the magnetic forces of Biot and 
Savart, specified by the mutual kinetic energy T'). 

The energy operator-matrix reduces in this case to the product
of the ordinary Hamiltonian operator for the system of particles under
consideration,

$$ K = K(x_1, \ldots, x_n;\ p_1, \ldots, p_n), $$

with the unit matrix (321 a). Limiting ourselves to solutions of the
type $\psi = \psi^0(x_1, \zeta_1, \ldots, x_n, \zeta_n)\,e^{-i2\pi K't/h}$, which correspond to a motion with
a fixed energy $K'$, we thus get, instead of (330),

$$ (K - K')\psi = 0. \tag{330 a} $$

This equation differs from that of the ordinary theory (not involving
the spin) only by the fact that $K$ is understood to contain as a factor
the unit matrix and that $\psi$ is to be regarded as a function both of the
ordinary coordinates and of the spin coordinates $\zeta_1, \ldots, \zeta_n$. Since $K$ does
not contain the latter (or, more exactly, the spin matrices $\sigma_1, \ldots, \sigma_n$),
these matrices must commute with $K$ and represent consequently constants
of the motion. The characteristic values of their $z$-components,
$\sigma_{zk} = 2m_k = \pm 1$, can be considered accordingly as additional spin
quantum numbers specifying $2^n$ solutions of (330 a), that is, $2^n$ degenerate
states which belong to the same value of the energy $K'$. We
shall distinguish these $2^n$ states with the help of the indices $m_1, \ldots, m_n$,
writing $m$ short for the whole set of them. It should be remembered
that the product of $m_k$ by $h/2\pi$ represents the projection of the spin
of the $k$th electron on the $z$-axis.

If we write $\zeta_k = -\tfrac12, +\tfrac12$ instead of 1 and 2 respectively (as was done
before), we can define a set of $2^n$ orthogonal and normal solutions of
the equation $(K - K')\psi_{K'} = 0$ which belong to the same characteristic
value of $K$ by the formula

$$ \psi_{K'm}(x, \zeta) = \phi_{K'}(x)\,\delta_{m\zeta}, \tag{331} $$

where $\delta_{m\zeta} = \delta_{m_1\zeta_1}\,\delta_{m_2\zeta_2} \cdots \delta_{m_n\zeta_n}$ is the $2^n$-dimensional unit matrix equivalent
to (321 a) and $\phi_{K'}(x)$ the normalized solution of the ordinary
Schrödinger equation $(K - K')\phi = 0$ not involving any spin coordinates.
We have in fact, by the definition (331),

$$ \int \psi^{\dagger}_{K'm'}\,\psi_{K'm''}\,dV = \int \sum_{\zeta} \psi^*_{K'm'}(x, \zeta)\,\psi_{K'm''}(x, \zeta)\,dV = \delta_{m'm''}. \tag{331 a} $$

This form of the solution of the Schrödinger equation with the spin
coordinates taken into account cannot, however, be reconciled with the
antisymmetry condition for the functions $\psi$, except when all the spin
quantum numbers $m_1, \ldots, m_n$ have the same value (either $\tfrac12$ or $-\tfrac12$). In
this case $\delta_{m\zeta}$ is a symmetrical function of the spin coordinates, and in
order to satisfy the antisymmetry condition we must define $\phi$ as an
antisymmetrical function of all the $n$ coordinate triplets $x_1, \ldots, x_n$.

If some of the numbers $m_k$ have the value $-\tfrac12$ and others the value
$+\tfrac12$, the function $\psi$ as defined by (331) will not be antisymmetrical,
whatever the type of the space factor $\phi$.

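The spin factors $\delta_{m\zeta}$ can be used, however, in this case to obtain
somewhat more complicated spin functions $\epsilon(\zeta)$ which are either symmetrical
with regard to all the variables $\zeta_1, \ldots, \zeta_n$ or with regard to some
of them, being in the latter case antisymmetrical with regard to definite
pairs of these variables.

A symmetrical spin function $\epsilon(\zeta)$ can be formed by permuting the
variables $\zeta_i$ and $\zeta_k$ in those factors $\delta_{m_i\zeta_i}$ and $\delta_{m_k\zeta_k}$ for which $m_i \neq m_k$
and adding the results. If instead of adding we subtract them from
each other, we shall get a function antisymmetrical with regard to the
pair of variables $(\zeta_i, \zeta_k)$. Putting for the sake of brevity

$$ \begin{aligned}
u(\zeta_i, \zeta_k) &= \delta_{-\frac12,\zeta_i}\,\delta_{+\frac12,\zeta_k} - \delta_{-\frac12,\zeta_k}\,\delta_{+\frac12,\zeta_i} = -u(\zeta_k, \zeta_i), \\
v(\zeta_i, \zeta_k) &= \delta_{-\frac12,\zeta_i}\,\delta_{+\frac12,\zeta_k} + \delta_{-\frac12,\zeta_k}\,\delta_{+\frac12,\zeta_i} = +v(\zeta_k, \zeta_i),
\end{aligned} \tag{332} $$

we get for $\epsilon(\zeta)$ an expression of the form

$$ \epsilon_{ij}(\zeta) = u(\zeta_1, \zeta_2)\,u(\zeta_3, \zeta_4) \cdots u(\zeta_{2i-1}, \zeta_{2i})\,\eta_j(\zeta_{2i+1}, \ldots, \zeta_n), \tag{332 a} $$

where $\eta_j(\zeta_{2i+1}, \ldots, \zeta_n)$ is a symmetrical function of the $n - 2i$ variables
$\zeta_{2i+1}, \ldots, \zeta_n$ formed by taking the product of a certain number $j$ of functions
of the type $v(\zeta_k, \zeta_l)$ and of $n - 2(i+j)$ simple functions $\delta_{m_k\zeta_k}$ with
the same value $m'$ of $m_k$, and summing such products for all non-trivial
permutations of the variables $\zeta_{2i+1}, \ldots, \zeta_n$:

$$ \eta_j(\zeta_{2i+1}, \ldots, \zeta_n) = \sum v(\zeta_{2i+1}, \zeta_{2i+2}) \cdots v(\zeta_{2i+2j-1}, \zeta_{2i+2j})\,\delta_{m'\zeta_{2i+2j+1}} \cdots \delta_{m'\zeta_n}. \tag{332 b} $$

The numbers $i$ and $j$ fully specify the spin functions $\epsilon_{ij}(\zeta)$ for a fixed
arrangement of the variables $\zeta_1, \ldots, \zeta_n$. By permuting the latter we can
obtain other functions of the same symmetry type.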
Before, however, proceeding to such permutations, let us multiply
the function (332 a) by a space factor $\phi_i(x)$ which we shall assume to
be symmetrical with regard to the pairs of coordinate triplets $(x_1, x_2)$,
$(x_3, x_4)$, ..., $(x_{2i-1}, x_{2i})$ and antisymmetrical with regard to the rest. The
product

$$ \phi_i(x)\,\epsilon_{ij}(\zeta) \tag{333} $$

will obviously be antisymmetrical with regard to the pairs of coordinate
quadruplets $(x_1, \zeta_1;\ x_2, \zeta_2)$, $(x_3, \zeta_3;\ x_4, \zeta_4)$, ..., $(x_{2i-1}, \zeta_{2i-1};\ x_{2i}, \zeta_{2i})$ and
antisymmetrical with regard to all the other coordinate quadruplets. It
will have, however, no symmetry whatever with regard to permutations
affecting the variables of different groups, corresponding, for example,
to interchanges between the first and the third electron, or the first and
the $(2i+1)$th one. If we now apply such permutations $(P_i)$ to the
function (333) and add the results, we can obtain a function

$$ \psi_{ij}(x, \zeta) = \sum_{P_i} P_i\bigl[\phi_i(x)\,\epsilon_{ij}(\zeta)\bigr], \tag{333 a} $$

which will be antisymmetrical with regard to all the electrons, i.e. all
the coordinate quadruplets. Permutations of this class can hardly be
defined explicitly for the general case (arbitrary values of $i$ and $j$).
They can, however, be specified unambiguously by certain simple conditions
which we shall not consider here.

The antisymmetrical wave functions (333 a) can also be obtained by
starting from ‘spinless’ functions of the type $\phi_{ij}(x)$ symmetrical with regard
to $i$ pairs of electrons and antisymmetrical with regard to $j$ other
pairs, while antisymmetrical with respect to all the other $n - 2(i+j)$
electrons. The complementary spin factors $\epsilon(\zeta)$ should reduce in this
case to a product of $i$ factors $u$, $j$ factors $v$, and $n - 2(i+j) = 2|m|$ factors
$\delta_{m'\zeta}$. The permutations $P_{ij}$ which must be applied to the products
$\phi_{ij}(x)\,\epsilon_{ij}(\zeta)$ in order to obtain the functions

$$ \psi_{ij}(x, \zeta) = \sum_{P_{ij}} P_{ij}\bigl[\phi_{ij}(x)\,\epsilon_{ij}(\zeta)\bigr], $$

identical with those defined by (333 a), will constitute a broader class
than the permutations $P_i$. In fact, they can be defined as the products
of the latter and of the permutations which must be applied to the
spin functions

$$ v(\zeta_{2i+1}, \zeta_{2i+2}) \cdots v(\zeta_{2(i+j)-1}, \zeta_{2(i+j)})\,\delta_{m'\zeta_{2(i+j)+1}} \cdots \delta_{m'\zeta_n} $$

in order to obtain upon addition the symmetrical function (332 b).

In constructing the functions (333 a) we have left out of account the condition
that they must satisfy the ‘spinless’ Schrödinger equation. Now
it is easily seen that this condition is fulfilled so long as it is fulfilled
for the space factor $\phi_i(x)$ in the initially chosen function (333). Applying
to the equation $K\phi_i(x) = K_i\phi_i(x)$ any permutation $P_i$, we have
indeed, since $K$ is symmetrical with regard to all the electrons and $K_i$
is a pure number,

$$ P_i[K\phi_i(x)] = K\,P_i[\phi_i(x)] = K_i\,P_i[\phi_i(x)]. $$

This shows that if $\phi_i(x)$ is a characteristic function of the operator $K$
belonging to a certain characteristic value (energy-level) $K_i$, then all
the functions resulting from it by permuting the electrons will also be
characteristic functions, belonging to the same energy-level. This being
so, any linear combination of such functions will have the same property,
which therefore will be shared by the unique combination (333 a)
satisfying the antisymmetry condition (the factors $P_i[\epsilon_{ij}(\zeta)]$, which are
equal either to $\pm 1$ or to 0, playing the role of ordinary coefficients
with regard to the functions $P_i[\phi_i(x)]$).

It remains to be seen whether the equation $K\phi = K'\phi$ actually has
solutions of the type $\phi_i(x)$, i.e. antisymmetrical with regard to all the
$n$ electrons ($i = 0$), or symmetrical with regard to one pair (1, 2), and
antisymmetrical with regard to the rest ($i = 1$), or symmetrical with
regard to two pairs [(1, 2), (3, 4)], and antisymmetrical with regard to
the rest ($i = 2$), and so on. A rigorous proof of this existence theorem
is not easy and we shall not stop to give it. The following remarks are
worth mentioning, however, in this connexion:

1. The functions $\phi_i$ defined above (or their linear combinations) are
not the only characteristic functions of a symmetrical operator $K$; the
latter has, besides, a number of characteristic functions with an entirely
different symmetry character, for instance, symmetrical with regard
to all the $n$ coordinate triplets or antisymmetrical with regard to two
or three of them, and symmetrical with regard to the rest, and so on.
Such solutions, although they exist mathematically, are non-existent
physically, i.e. they do not correspond to any real phenomenon, for
they cannot provide a basis for constructing functions antisymmetrical
with regard to all the coordinate quadruplets $x_k$, $\zeta_k$. The fact that such
a basis is provided only by functions of the type $\phi_i(x)$ is a consequence
of the two-valuedness of the spin quantum numbers $m_k$ of the individual
electrons, this two-valuedness determining the symmetry type of the
‘spin-factors’ $\epsilon(\zeta)$ and thence indirectly the symmetry type of the
associated space-factors $\phi(x)$.

2. The functions $\phi_i(x)$ (or their linear combinations) corresponding
to different values of $i$ belong in general to different characteristic
values $K_i$ of the energy operator. They can be introduced as ‘non-combining’
solutions of the equation $\left(K + \frac{h}{2\pi i}\frac{\partial}{\partial t}\right)\phi = 0$
also in the case when $K$ contains the time explicitly (i.e. when the electrons are
supposed to move under the influence of a variable field of some external
origin). In this case the symmetry character of $\phi$ remains a permanent
property, if no difference is made between various linear combinations
of the functions $P_i[\phi_i(x)]$ with the same value of $i$, the permanence of
the antisymmetry character of $\phi_0$ being a particular case of this theorem
(the latter holds likewise for a number of solutions belonging to other
symmetry classes, not realized in nature).

It will be convenient in the sequel to replace the numbers $i$ and $j$,
which specify the functions (333) or (333 a), by two other numbers,

$$ s = \tfrac12(n - 2i) = \tfrac12 n - i \tag{334} $$

and

$$ m = \sum_{k=1}^{n} m_k = \pm\left(\tfrac12 n - i - j\right). \tag{334 a} $$

The latter can obviously be interpreted as the component of the resulting
spin angular momentum of all the electrons along the $z$-axis (in
$h/2\pi$ units); in fact it is equal to the algebraic sum of the characteristic
values of the matrices $\tfrac12\sigma_{zk}$ for the individual electrons. For a given
value of $s$, $m$ can assume $2s+1$ values differing from each other by 1
and lying between $+s$ and $-s$. This circumstance suggests the interpretation
of $s$ as the magnitude of the vector specifying the resulting spin
of all the electrons (irrespective of its direction). The characteristic
value of the square of this total spin is equal to the product of $(h/2\pi)^2$
with $s(s+1)$, just as in the case of the resulting ‘orbital’ momentum,
defined by the number $j$ (see § 37).

The above interpretation of the number $s$ is also supported by the
fact that its maximum value is equal to $\tfrac12 n$, which corresponds to the
same direction of the spin vectors $\sigma_k$ of the separate electrons. It
thus appears that the resulting spin associated with a given solution $\phi_i$
of the ‘spinless’ Schrödinger equation is equal to one-half of the number
of electrons with regard to which this function is antisymmetrical.
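
The bookkeeping contained in (334) and (334 a) can be tabulated mechanically. The short sketch below is an illustration only; the function name `multiplets` is not from the original text, and it assumes that $j$ runs over all values for which $n - 2(i+j) \geq 0$. For each admissible $i$ it lists the resulting $s$ together with the values of $m$.

```python
from fractions import Fraction


def multiplets(n):
    """Map each total spin s = n/2 - i to the sorted list of its m values.

    i counts the pairs entering through the antisymmetrical factor u,
    j the pairs entering through the symmetrical factor v, as in (334), (334 a).
    """
    table = {}
    for i in range(n // 2 + 1):
        s = Fraction(n, 2) - i
        values = set()
        for j in range((n - 2 * i) // 2 + 1):
            abs_m = Fraction(n, 2) - i - j      # |m| = n/2 - i - j
            values.update({abs_m, -abs_m})
        table[s] = sorted(values)
    return table
```

For `n = 2` this gives $s = 1$ with $m = -1, 0, +1$ and $s = 0$ with $m = 0$; for `n = 3` it gives $s = \tfrac32$ with four values of $m$ and $s = \tfrac12$ with two, in agreement with the count $2s+1$.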

We shall now consider, for the sake of illustration, the special cases
of systems consisting of two and three electrons, a helium and a lithium
atom, say. In the first case we get functions $\phi_i(x)$ of two types only,
namely, the antisymmetrical one $\phi_0(x) = \phi_0(\underline{x_1, x_2})$ and the symmetrical
$\phi_1(x) = \phi_1(\overline{x_1, x_2})$ (following Heitler and London, we introduce lines
under or over the neighbouring variables, to indicate the antisymmetrical
or symmetrical character of the wave function with regard
to these variables). Taking further the four combinations of the individual
spin quantum numbers $m_1$ and $m_2$, namely, $(-\tfrac12, -\tfrac12)$, $(-\tfrac12, +\tfrac12)$,
$(+\tfrac12, -\tfrac12)$, $(+\tfrac12, +\tfrac12)$, we can form three symmetrical spin functions,

$$ \delta_{-\frac12\zeta_1}\,\delta_{-\frac12\zeta_2}, \qquad v(\zeta_1, \zeta_2) = \delta_{-\frac12\zeta_1}\,\delta_{+\frac12\zeta_2} + \delta_{-\frac12\zeta_2}\,\delta_{+\frac12\zeta_1}, \qquad \delta_{+\frac12\zeta_1}\,\delta_{+\frac12\zeta_2}, $$

and one antisymmetrical,

$$ u(\zeta_1, \zeta_2) = \delta_{-\frac12\zeta_1}\,\delta_{+\frac12\zeta_2} - \delta_{-\frac12\zeta_2}\,\delta_{+\frac12\zeta_1}. $$

The products of the former with the antisymmetrical space function
$\phi_0(x_1, x_2)$ define three states, corresponding to the same resulting spin
$s = 1$ (parallel orientation of the two electrons) and to the values
$m = -1, 0, +1$ of its projection on the $z$-axis, whereas the product of
$u(\zeta_1, \zeta_2)$ with $\phi_1(x_1, x_2)$ defines a single state corresponding to $s = 0$ and
$m = 0$ (‘anti-parallel’ orientations of the spins).
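
The symmetry of the four two-electron spin functions can be confirmed by direct enumeration, since each spin coordinate takes only two values. The following is a minimal sketch, not part of the original text; the helper names (`delta`, `down_down`, `up_up`, `is_symmetric`, `is_antisymmetric`) are illustrative, and the functions $u$ and $v$ are taken without any normalizing factor, as assumed in (332).

```python
from fractions import Fraction
from itertools import product

HALF = Fraction(1, 2)
ZETAS = (-HALF, HALF)          # the two admissible values of a spin coordinate


def delta(m, zeta):
    """Kronecker symbol delta_{m, zeta}."""
    return 1 if m == zeta else 0


def u(z1, z2):
    """Antisymmetrical combination of (332)."""
    return delta(-HALF, z1) * delta(HALF, z2) - delta(-HALF, z2) * delta(HALF, z1)


def v(z1, z2):
    """Symmetrical combination of (332)."""
    return delta(-HALF, z1) * delta(HALF, z2) + delta(-HALF, z2) * delta(HALF, z1)


def down_down(z1, z2):
    """Both spin quantum numbers equal to -1/2."""
    return delta(-HALF, z1) * delta(-HALF, z2)


def up_up(z1, z2):
    """Both spin quantum numbers equal to +1/2."""
    return delta(HALF, z1) * delta(HALF, z2)


def is_symmetric(f):
    return all(f(a, b) == f(b, a) for a, b in product(ZETAS, repeat=2))


def is_antisymmetric(f):
    return all(f(a, b) == -f(b, a) for a, b in product(ZETAS, repeat=2))
```

Here `down_down`, `v` and `up_up` pass `is_symmetric` (the three states with $s = 1$), while `u` passes `is_antisymmetric` (the single state with $s = 0$).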

In the case of three electrons we must distinguish likewise two types
of ‘spinless’ functions, namely, those antisymmetrical with regard to
all the three electrons, $\phi_0(x) = \phi_0(\underline{x_1, x_2, x_3})$, and those symmetrical with
regard to two of them, $\phi_1(x) = \phi_1(\overline{x_1, x_2}, x_3)$, say (the third electron, being
alone, does not require any specific condition with respect to symmetry).

The functions of the first type must be combined with a symmetrical
spin factor $\epsilon(\zeta_1, \zeta_2, \zeta_3)$ which can be obtained either in the form

$$ \epsilon = \delta_{m'\zeta_1}\,\delta_{m'\zeta_2}\,\delta_{m'\zeta_3}, $$

if $m_1 = m_2 = m_3 = m' = \pm\tfrac12$ ($\sum m_k = \pm\tfrac32$), or in the form

$$ \epsilon = v(\zeta_1, \zeta_2)\,\delta_{m'\zeta_3} + v(\zeta_2, \zeta_3)\,\delta_{m'\zeta_1} + v(\zeta_3, \zeta_1)\,\delta_{m'\zeta_2}, $$

if one of the numbers $m_k$ is different from the two others ($\sum m_k = \pm\tfrac12$).
We thus get a ‘quadruplet’, i.e. four states with the same $s = \tfrac32$ and consequently
with the same value of the energy $K = K_0$, which are distinguished
from each other by the values of the resulting ‘axial’ spin
numbers $m = -\tfrac32, -\tfrac12, +\tfrac12, +\tfrac32$.

The functions of the second type, $\phi_1(\overline{x_1, x_2}, x_3)$, must be combined
with spin factors of the form $u(\zeta_1, \zeta_2)\,\delta_{m'\zeta_3}$
and summed over the cyclic permutations of all the three electrons,
giving two antisymmetrical functions,

$$ \psi(x, \zeta) = \phi_1(\overline{x_1, x_2}, x_3)\,u(\zeta_1, \zeta_2)\,\delta_{m'\zeta_3} + \phi_1(\overline{x_2, x_3}, x_1)\,u(\zeta_2, \zeta_3)\,\delta_{m'\zeta_1} + \phi_1(\overline{x_3, x_1}, x_2)\,u(\zeta_3, \zeta_1)\,\delta_{m'\zeta_2}, $$

for two different values of $m'$; the states defined by them belong to the
same value $\tfrac12$ of $s$ and to the same energy $K = K_1$, forming what is
called a ‘spin doublet’ of a similar type to that for a single electron.
The antisymmetrical character of the functions $\psi(x, \zeta)$ is clearly seen
from the fact that if two electrons, the first and the second, say, are 
interchanged, the first term changes its sign, whereas the second and 
third are transformed into each other with opposite signs. It should 
be mentioned that the normal state of a lithium atom, constituted by 
two equivalent inner electrons, forming its ‘core’, and one ‘valence’ 
electron, must be described by a wave function of the above type.
REDUCTION OF THE PROBLEM OF A SYSTEM OF
IDENTICAL PARTICLES TO THAT OF A SINGLE PARTICLE 

41. Perturbation Theory of a System of Spinless Electrons and 
the Exchange Degeneracy 

Further progress in the study of the problem of many electrons can be 
achieved only if we describe their motion in a way similar to that used 
in Bohr’s theory of complex atoms, namely, by assigning to each 
electron an individual state of motion in a given field of force. The 
mutual action of the electrons can be partially accounted for by introducing
some constants, like the screening constants, in the definition of
the appropriate field of force for each electron, or by using the same 
suitably chosen field of force for all of them—a self-consistent field, for 
example (see below). The problem of the motion of the whole system is 
thus reduced to that of the motion of the separate particles constituting 
it and to the determination of the effective external field which can 
approximately represent their mutual action. Inasmuch as this mutual 
action is accounted for inexactly, we can obtain a better approximation 
by treating it, or that part of it which was not included to begin with 
in the effective field of force, as a small perturbation, and approach 
the exact solution by the methods of the perturbation theory, starting
with the solution which corresponds to a distribution of the electrons 
between various individual states of motion (or ‘orbits’). 

A characteristic distinction between Bohr’s theory and the new 
quantum theory in connexion with this perturbation problem consists 
in the fact that the electrons must be interchanged between all the individual 
orbits in such a way as to be completely stripped of their individuality. 
This result which is expressed by the symmetry principle for the 
probability density or the antisymmetry principle for the probability
amplitude $\psi$ can be shown to be in harmony with the principles of
the perturbation theory applied to the problem of a system of identical 
particles. 

The wave function $\phi$ describing their motion can be represented to
begin with as the product of the functions $\psi_1(x_1), \psi_2(x_2), \ldots, \psi_n(x_n)$
describing the behaviour of the individual electrons in the given external
field of force. Putting

$$\phi = \psi_1(x_1)\,\psi_2(x_2)\cdots\psi_n(x_n) \tag{335}$$

and denoting by $P\phi$ the function into which $\phi$ is transformed when
the permutation $P$ is applied to the electrons, we can represent the
general solution of our undisturbed problem, belonging to the same
energy as $\phi(x)$, by the expression

$$\chi(x) = \sum_P C_P\, P\phi, \tag{335 a}$$

where $C_P$ are arbitrary coefficients, the sum being extended over all
the possible permutations, or at least over the ‘effectively different’
ones, i.e. such as lead to different functions $P\phi$.

If all the $n$ individual wave functions $\psi_1, \psi_2, \ldots, \psi_n$ are different, every
one of the $n!$ possible permutations $P$ will be associated with a specific
function $P\phi$. In the contrary case the permutations $P$ can be subdivided
into separate sets of equivalent permutations, which correspond
to identical functions $P\phi$, and in writing down (335 a) we shall have
to consider only one representative of each set.
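
The count of ‘effectively different’ permutations can be checked by brute force. The following is only a sketch under stated assumptions: the product function is represented by a tuple of orbital labels (one label per electron), and the names `distinct_products` and `expected_count` are hypothetical helpers introduced for illustration.

```python
from collections import Counter
from itertools import permutations
from math import factorial


def distinct_products(orbitals):
    """Return the set of distinct products P(phi).

    orbitals[i] is the label of the individual state occupied by electron i
    in phi; a permutation of the electrons rearranges these labels.
    """
    return set(permutations(orbitals))


def expected_count(orbitals):
    """n! divided by the factorials of the multiplicities of equal orbitals."""
    count = factorial(len(orbitals))
    for multiplicity in Counter(orbitals).values():
        count //= factorial(multiplicity)
    return count


# All orbitals different: each of the n! permutations gives a new function.
all_different = distinct_products(("a", "b", "c"))

# One orbital occupied twice: the permutations fall into sets of equivalent ones.
one_repeated = distinct_products(("a", "a", "b"))

print(len(all_different), len(one_repeated))
```

With three different orbitals all six permutations are effectively different, whereas with one orbital occupied twice only three representatives remain.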

We shall assume for the sake of simplicity that apart from this 
‘exchange degeneracy’, arising from the possibility of interchanging 
the electrons between different individual states without altering the 
total energy, no other type of degeneracy need be considered. 

We shall disregard in this section the spin effects and treat the
electrons as spinless particles, using for the determination of their
motion the ordinary Schrödinger theory. We shall leave aside furthermore
the question as to the symmetry of the functions $\chi(x)$ and shall
try to determine the coefficients $C_P$ by which they are defined in such
a way as to ensure the approximate validity of the expression (335 a)
when the perturbing forces (i.e. the mutual action of the electrons or
the neglected part of this mutual action) are taken into account. In
this case the function (335 a) is said to be ‘stabilized’ for the perturbation.
It is meant by this that if the approximation is pushed further,
the coefficients $C_P$ will suffer but a slight variation. This question has
been considered in its most general form in the perturbation theory of
degenerate systems. As has been shown there, the degenerate set of
states specified by the functions $P\phi$ gives rise to the same number
of states belonging in general to different energy-levels $H'$ and specified
by the values of the coefficients $C_P$ which satisfy the system of
equations

$$\sum_Q H_{PQ}\, C_Q = H' C_P, \tag{336}$$

where $H_{PQ}$ are the matrix elements of the total energy with regard to
the approximate functions $P\phi$ and $Q\phi$:

$$H_{PQ} = \int P\phi^*\, H\, Q\phi \; dV, \tag{336 a}$$

$Q$ denoting, as well as $P$, a permutation of the electrons.
In writing down the equations (336) we are tacitly assuming that the
different functions $P\phi$ are mutually orthogonal. This assumption is
easily seen to be verified if the functions $\psi_i(x_i)$ describing the different
individual states are orthogonal with regard to each other. Now the
mutual orthogonality of the individual functions is automatically
secured if they represent different stationary states of an electron in
a given external field, the same for all the $n$ electrons. In many actual
problems it is more convenient, however, to assign to each electron
a specific field of force (for instance, a Coulomb field, characterized
by a specific value of the screening constant in the problem of the
distribution of electrons in a heavy atom), in which case the individual
wave functions can no longer be considered as mutually orthogonal.

The equations (336) must be replaced in this case according to (61),
§ 9, by the following ones:

$$\sum_Q (H_{P,Q} - H' J_{P,Q})\, C_Q = 0, \tag{337}$$

where

$$J_{P,Q} = \int P\phi^*\, Q\phi \; dV. \tag{337 a}$$


The value of this integral must obviously remain unaltered if the
integration variables are replaced by any other ones (which amounts
simply to a change of notation). We can, in particular, interchange
them in a manner corresponding to an arbitrary permutation $R$ of the
electrons. The functions $P\phi^*$ and $Q\phi$ will be replaced accordingly by
$RP\phi^*$ and $RQ\phi$, so that we shall get

$$J_{P,Q} = \int RP\phi^*\, RQ\phi \; dV = J_{RP,RQ}.$$

It should be noticed that the permutation $R$ must not be applied to
the functions $\phi^*$ and $\phi$ directly, the result

$$\int PR\phi^*\, QR\phi \; dV = J_{PR,QR}$$

being in general quite different from the preceding one.

If, in particular, $R$ is identified with the reciprocal of $Q$ ($R = Q^{-1}$),
we get

$$J_{P,Q} = J_{Q^{-1}P}, \tag{338}$$

where $J_S$ is an abbreviation for $J_{S,I}$, $I$ denoting the identical
permutation ($I\phi = \phi$).

We get likewise, because of the symmetry of the energy operator $H$
with regard to all the electrons,

$$H_{P,Q} = H_{RP,RQ}$$

and in particular

$$H_{P,Q} = H_{Q^{-1}P}, \tag{338 a}$$

$H_R$ being an abbreviation for $H_{R,I}$.
The relations $J_{Q,P} = J^*_{P,Q}$ and $H_{Q,P} = H^*_{P,Q}$ can be written
accordingly in the following form:

$$J_{R^{-1}} = J_R^*, \qquad H_{R^{-1}} = H_R^*, \tag{338 b}$$

where $R = Q^{-1}P$ and $R^{-1} = P^{-1}Q$. We thus see that the number of
different matrix elements $H_{PQ}$ and $J_{PQ}$ is actually reduced to the
number, $g$ say, of different states $P\phi$ instead of being equal to its
square $g^2$.
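
The invariance of $J_{P,Q}$ under a relabelling of the integration variables, and its reduction to $J_{Q^{-1}P}$, can be verified numerically. The sketch below is an illustration only and rests on assumptions not made in the text: the orbitals are random complex vectors on a small grid (deliberately non-orthogonal), the integral is replaced by a discrete sum, and a permutation of the electrons is realized as a permutation of the array axes.

```python
import itertools

import numpy as np

rng = np.random.default_rng(0)
n, m = 3, 4  # three electrons, four grid points per coordinate (toy sizes)

# Non-orthogonal complex "orbitals" sampled on the grid (arbitrary test data).
orbitals = rng.normal(size=(n, m)) + 1j * rng.normal(size=(n, m))

# phi(x1, x2, x3) = psi_1(x1) psi_2(x2) psi_3(x3), stored as a 3-index array.
phi = np.einsum("a,b,c->abc", *orbitals)

perms = list(itertools.permutations(range(n)))


def permute_electrons(perm, f):
    """Apply a permutation of the electrons, i.e. of the arguments of f."""
    return np.transpose(f, perm)


def overlap(f, g):
    """Discrete analogue of the integral of f* g over all coordinates."""
    return np.vdot(f, g)  # vdot conjugates its first argument


def inverse(p):
    """Reciprocal permutation."""
    return tuple(np.argsort(p))


def J(p, q):
    return overlap(permute_electrons(p, phi), permute_electrons(q, phi))


# Relabelling the integration variables by any R leaves J_{P,Q} unchanged.
relabelling_ok = all(
    np.isclose(
        J(p, q),
        overlap(
            permute_electrons(r, permute_electrons(p, phi)),
            permute_electrons(r, permute_electrons(q, phi)),
        ),
    )
    for p in perms for q in perms for r in perms
)

# With R = Q^{-1} the second factor becomes phi itself, as in (338).
reduction_ok = all(
    np.isclose(
        J(p, q),
        overlap(permute_electrons(inverse(q), permute_electrons(p, phi)), phi),
    )
    for p in perms for q in perms
)

print(relabelling_ok, reduction_ok)
```

Since the check never multiplies two permutations explicitly, it does not depend on the convention adopted for the order of the factors in a product of permutations.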

The equations (337) can be rewritten as follows:

$$\sum_R (H_R - H' J_R)\, C_{PR^{-1}} = 0, \tag{339}$$

the summation over all the permutations $R$ being obviously equivalent
to the original summation over the permutations $Q$, with a fixed
permutation $P$, the latter specifying each of the $g$ equations forming our
system. The perturbed values of the energy $H'$ are determined as the
roots of the determinantal equation

$$|H_{Q^{-1}P} - H' J_{Q^{-1}P}| = 0, \tag{339 b}$$

which expresses the condition of their compatibility.

Two types of solution of our perturbation problem are immediately
obtained from the equations (339), namely, those which correspond
to the symmetrical and to the antisymmetrical functions $\chi$. In the
former case all the coefficients $C_Q$ are equal, so that they cancel out
and the equations (339) reduce to the single equation

$$\sum_R (H_R - H' J_R) = 0,$$

which serves for the determination of the energy

$$H'_{\text{sym}} = \sum_R H_R \Big/ \sum_R J_R. \tag{340}$$

In the latter case the coefficients $C_Q$ are defined by the formula
$C_Q = \epsilon_Q C$, where $\epsilon_Q = +1$ for even permutations (equivalent to an
even number of transpositions) and $= -1$ for odd ones. Since in this
case $C_{PR^{-1}} = \epsilon_P \epsilon_R C$, the $g$ equations (339) again reduce to the single
equation

$$\sum_R \epsilon_R (H_R - H' J_R) = 0,$$

whence

$$H'_{\text{antisym}} = \sum_R \epsilon_R H_R \Big/ \sum_R \epsilon_R J_R. \tag{340 a}$$
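
The two special solutions can be tested on a small model. The following sketch is illustrative only: the values of $H_R$ and $J_R$ are made-up real numbers for three electrons, chosen merely to satisfy (338 b), and the matrices are built from them through $H_{P,Q} = H_{Q^{-1}P}$ and $J_{P,Q} = J_{Q^{-1}P}$.

```python
import itertools

import numpy as np

perms = list(itertools.permutations(range(3)))
index = {p: k for k, p in enumerate(perms)}
identity = tuple(range(3))


def compose(p, q):
    """Product pq: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(p)))


def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)


def sign(p):
    """+1 for even permutations, -1 for odd ones (parity of the inversions)."""
    inversions = sum(
        1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j]
    )
    return -1 if inversions % 2 else 1


rng = np.random.default_rng(1)
# Made-up real H_R and J_R with H_{R^-1} = H_R and J_{R^-1} = J_R.
H_R, J_R = {}, {}
for r in perms:
    if inverse(r) in H_R:
        H_R[r], J_R[r] = H_R[inverse(r)], J_R[inverse(r)]
    else:
        H_R[r], J_R[r] = rng.normal(), 0.05 * rng.normal()
J_R[identity] = 1.0  # phi is normalized

g = len(perms)
H = np.empty((g, g))
J = np.empty((g, g))
for p in perms:
    for q in perms:
        r = compose(inverse(q), p)  # R = Q^{-1} P
        H[index[p], index[q]] = H_R[r]
        J[index[p], index[q]] = J_R[r]

E_sym = sum(H_R.values()) / sum(J_R.values())                      # (340)
E_anti = (sum(sign(r) * H_R[r] for r in perms)
          / sum(sign(r) * J_R[r] for r in perms))                  # (340 a)

C_sym = np.ones(g)
C_anti = np.array([sign(q) for q in perms], dtype=float)

residual_sym = (H - E_sym * J) @ C_sym
residual_anti = (H - E_anti * J) @ C_anti
print(np.abs(residual_sym).max(), np.abs(residual_anti).max())
```

Both residuals vanish to rounding error, which confirms that equal coefficients and coefficients $\epsilon_Q C$ satisfy the system (337) with the energies (340) and (340 a) respectively.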

One might be tempted to look for more general solutions of (339) by
assuming that $C_{PQ} = \text{const.}\; C_P C_Q$, or $C_P = \text{const.}\; e^{i\theta_P}$. It can easily
be shown, however, in the same way as in Part I, § 22, that this assumption
leads to symmetrical and antisymmetrical functions only. The
symmetry properties of all the other solutions can be determined by
the following method due to Dirac.
According to Dirac, permutations can be dealt with in exactly the
same way as ordinary linear operators which serve to represent various
physical quantities. They can, in fact, be multiplied by each other, the
product being in general non-commutative, i.e. depending upon the
order of the factors, but satisfying the associative law (just as in
the case of differential or matrix operators investigated hitherto).
It is possible further to define the sum of two or more permutations as
an operator, which without being itself a permutation is equivalent to
them in the sense of the distributive law:

$$(P_1 + P_2)F = P_1 F + P_2 F,$$

where $F$ denotes any other operator or function.
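To each permutation

$$P = \begin{pmatrix} 1, & 2, & \ldots, & n \\ k_1, & k_2, & \ldots, & k_n \end{pmatrix}$$

there corresponds the reciprocal permutation

$$P^{-1} = \begin{pmatrix} k_1, & k_2, & \ldots, & k_n \\ 1, & 2, & \ldots, & n \end{pmatrix},$$

whose product with $P$, irrespective of the order of the two factors, is
equal to 1, i.e. is equivalent to the ‘identical’ permutation

$$1 = \begin{pmatrix} 1, & 2, & \ldots, & n \\ 1, & 2, & \ldots, & n \end{pmatrix}.$$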

Every permutation $P$ can be represented as a product of ‘cyclic’
permutations, of the type

$$(1, 2, 3, 4),$$

where each element in the brackets ( ) is replaced by the next, the last
one being replaced by the first. The different cycles into which $P$ is
thus factorized must have no common elements; they can be therefore
commuted with each other without changing the result. We have,
for example,

$$\begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 \\ 7 & 5 & 4 & 2 & 3 & 9 & 1 & 8 & 6 \end{pmatrix} = (1, 7)(2, 5, 3, 4)(6, 9)(8),$$

the two-element cycles (1, 7) and (6, 9) being simply transpositions (i.e.
interchanges of two elements), while the one-element cycle (8) denotes
that the corresponding element is not affected by the permutation
considered.
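
The factorization into cycles is easily mechanized. The sketch below (the function name `cycles` and the list representation of the two-row symbol are assumptions made for illustration) reproduces the example just given.

```python
def cycles(images):
    """Factorize a permutation into cycles having no common elements.

    images[k - 1] is the element that replaces k, so the two-row symbol
    (1 2 ... n / k_1 k_2 ... k_n) is passed as [k_1, k_2, ..., k_n].
    """
    seen = set()
    result = []
    for start in range(1, len(images) + 1):
        if start in seen:
            continue
        cycle = []
        element = start
        while element not in seen:
            seen.add(element)
            cycle.append(element)
            element = images[element - 1]  # follow "replaced by the next"
        result.append(tuple(cycle))
    return result


example = cycles([7, 5, 4, 2, 3, 9, 1, 8, 6])
partition = sorted(len(c) for c in example)
print(example)    # [(1, 7), (2, 5, 3, 4), (6, 9), (8,)]
print(partition)  # [1, 2, 2, 4]
```

The sorted list of cycle lengths is the partition of $n$ which labels the class of the permutation.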

Permutations which can be factorized into the same number of cycles
with the same number of elements (which may be different for different
permutations) are called ‘similar’ and form a ‘class’ specified by the
‘partition’ of the number $n$ into summands giving the number of elements
in each cycle. The partition for the above permutation is

$$n = 1 + 2 + 2 + 4.$$

Similar permutations $P$ and $Q$ can thus be obtained from each other
by permuting the elements appearing in the cycles of one of them.
Denoting by $R$ the permutation which must be carried out in the
cycles of $P$ in order to obtain $Q$, we get

$$Q = RPR^{-1}.$$

The factor $R^{-1}$ accounts for the fact that the permutation $R$ should
not affect the operator or function to which $P$ or $Q$ is supposed to
be applied ($RPF$ would be equivalent to applying the permutation $R$
both to $P$ and to $F$).

Since every permutation $P$ commutes with the energy operator $H$ ($H$
being symmetrical with respect to all the electrons), it can be treated
as a constant of the motion. The fact that the different permutations
do not in general commute with each other shows that it is impossible
to assign simultaneously definite values to all these constants. It is
possible, however, to combine them linearly into a set of commutable
operators, which can be constructed by adding together all the permutations
belonging to the same class. With a fixed $P$ and a variable $R$
each permutation $Q = RPR^{-1}$ will be obtained several times, namely,
$n!/n_k$ times, where $n_k$ is the number of different permutations in the class
under consideration. The sum of all such permutations, or their
‘average’

$$\bar P = \frac{1}{n!} \sum_R RPR^{-1},$$

will obviously commute with all the permutations. We have in fact

$$T\bar P T^{-1} = \frac{1}{n!} \sum_R TRPR^{-1}T^{-1}$$

or, putting $TR = S$ and $R^{-1}T^{-1} = S^{-1}$,

$$T\bar P T^{-1} = \frac{1}{n!} \sum_S SPS^{-1} = \bar P$$

(since for a fixed $T$ and a variable $R$ the product $TR$ varies over the
same range as $R$). Hence $T\bar P = \bar P T$. It follows in particular that the
operators $\bar P_k$ referring to different classes ($k = 1, 2, \ldots$) commute with
each other. Since, moreover, they commute with the energy operator
$H$, they can be considered as defining a set of independent constants
of the motion whose characteristic values $P'_k$ can be determined
simultaneously and can serve, together with the characteristic values of the
energy $H'$, to specify the stationary states of the system.
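
The commutability of the class sums can be confirmed by direct computation. In the sketch below, which is an illustration under assumptions of our own choosing, each permutation of four objects is represented by the matrix of its action on the group itself by multiplication from the left (the regular representation), so that sums of permutations become ordinary matrices.

```python
import itertools
from math import factorial

import numpy as np

n = 4
perms = list(itertools.permutations(range(n)))
index = {p: k for k, p in enumerate(perms)}
g = len(perms)


def compose(p, q):
    """Product pq: apply q first, then p."""
    return tuple(p[q[i]] for i in range(n))


def inverse(p):
    inv = [0] * n
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)


def cycle_type(p):
    """Sorted cycle lengths: the partition of n that labels the class of p."""
    seen, lengths = set(), []
    for start in range(n):
        length, element = 0, start
        while element not in seen:
            seen.add(element)
            element = p[element]
            length += 1
        if length:
            lengths.append(length)
    return tuple(sorted(lengths))


def regular(p):
    """Matrix of p acting on the group by multiplication from the left."""
    mat = np.zeros((g, g), dtype=int)
    for q in perms:
        mat[index[compose(p, q)], index[q]] = 1
    return mat


matrices = {p: regular(p) for p in perms}

# Sum of all the permutations belonging to each class.
class_sums = {}
for p in perms:
    key = cycle_type(p)
    class_sums[key] = class_sums.get(key, 0) + matrices[p]

commute_with_all = all(
    np.array_equal(s @ matrices[t], matrices[t] @ s)
    for s in class_sums.values() for t in perms
)

# With a fixed P and a variable R, each member of the class of P is
# obtained n!/n_k times among the products R P R^{-1}.
P = (1, 0, 2, 3)  # a transposition
n_k = sum(1 for p in perms if cycle_type(p) == cycle_type(P))
conjugates = sum(matrices[r] @ matrices[P] @ matrices[inverse(r)] for r in perms)
each_obtained_equally = np.array_equal(
    n_k * conjugates, factorial(n) * class_sums[cycle_type(P)]
)
print(commute_with_all, each_obtained_equally, len(class_sums))
```

For $n = 4$ there are five classes, corresponding to the five partitions of 4, and each class sum commutes with every one of the 24 permutations.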

The characteristic values of the operators $\bar P$ are obviously wholly
independent of the form of the energy operator (so long as it is
symmetrical between all the electrons). They must be connected therefore
with the symmetry properties of the wave functions $\chi$ which belong
to them and can serve for the classification of the latter.

It should be noticed that the operators $\bar P$ preserve their role of
constants of the motion in the general case of an energy operator containing
the time explicitly. This means that if the wave function $\chi$ satisfying
Schrödinger’s equation

$$-\frac{h}{2\pi i}\frac{\partial \chi}{\partial t} = H\chi$$

has at the initial moment $t_0$ a definite symmetry type, specified by
certain characteristic values of the operators $\bar P$, it will maintain the
same symmetry type at any other time. The same results can be expressed
by saying that the stationary states of an unperturbed system belonging
to different characteristic values of the permutation operators $\bar P$ do not
combine with each other under any perturbation (symmetrical in all the
electrons).

The simplest examples of this theorem are provided by the sym¬ 
metrical and the antisymmetrical wave functions. The characteristic 
values of the P are equal to +1 for the former and to ± 1 for the latter 
(+1 for even permutations and —1 for odd ones). 

So long as the spin effects are left out of account we have to consider 
symmetrical and antisymmetrical functions only; if, however, the spin 
effect is allowed for, spinless functions of a more complicated character 
have to be admitted; to each set of characteristic $P$-values there
corresponds in general not one but many wave functions of the same
symmetry type (cf. Part I, § 22). If, moreover, the spin forces are taken
into account (as a small perturbation), the states corresponding to 
different P-values will combine with each other. We thus get rather 
complicated results which can, however, be reduced to the original 
simple form if the spin coordinates are introduced in the definition of 
the wave functions on the same footing as the geometrical ones. 

If the electrons are associated with different individual states specified
by mutually orthogonal wave functions, the set of functions $P\phi$ can be
replaced by the set $P_\psi\phi$ obtained from $\phi = \psi_1(x_1)\psi_2(x_2)\cdots\psi_n(x_n)$ by
applying the different permutations $P$ not to the arguments of the
functions $\psi$ but to their indices, that is, by permuting not the electrons
between the given states, but on the contrary the different states between the
electrons. Since by applying the same permutation $P$ both to the arguments
and to the indices, we obviously do not change the resulting
factorized function, we can put

$$P_x P_\psi = P_\psi P_x = 1,$$

where the suffix $x$ has been added to indicate explicitly that $P$ is applied
to the electrons. We thus see that $P_\psi$ plays the same role as the
reciprocal of $P_x$ and vice versa.
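
The reciprocity between permuting the electrons and permuting the states can be illustrated with a purely combinatorial model. In the sketch below (an assumption-laden illustration, not the notation of the text) a product function is stored as a tuple `state`, where `state[e]` is the index of the orbital occupied by electron `e`, so that $\phi$ itself is the identical assignment.

```python
from itertools import permutations

n = 3
identity = tuple(range(n))
phi = identity  # electron e occupies orbital e


def permute_electrons(p, state):
    """P_x: electron e is renamed p[e], carrying its orbital with it."""
    new = [None] * n
    for e in range(n):
        new[p[e]] = state[e]
    return tuple(new)


def permute_states(p, state):
    """P_psi: the orbital index k is replaced by p[k] for every electron."""
    return tuple(p[state[e]] for e in range(n))


def inverse(p):
    inv = [0] * n
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)


# Applying the same permutation to arguments and indices leaves phi unchanged.
both_cancel = all(
    permute_electrons(p, permute_states(p, phi)) == phi
    for p in permutations(identity)
)

# P_psi acting on phi gives the same function as the reciprocal of P_x.
acts_as_reciprocal = all(
    permute_states(p, phi) == permute_electrons(inverse(p), phi)
    for p in permutations(identity)
)
print(both_cancel, acts_as_reciprocal)
```

Both statements refer to the action on the factorized function $\phi$ itself, which is the case considered in the text.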

Taking the matrix elements of the energy with respect to the new
functions $P_\psi\phi$ and remembering that they are invariant with regard
to any permutation $R$ of the electrons (i.e. of the integration variables),
we have

$$H^\psi_{P,Q} = \int P_\psi\phi^*\, H\, Q_\psi\phi \; dV = R_x \int P_\psi\phi^*\, H\, Q_\psi\phi \; dV = \int P_\psi R_x\phi^*\, H\, Q_\psi R_x\phi \; dV$$

(since we must first permute the integration variables in $\phi^*$ and $\phi$ and
only thereafter carry out the permutations $P_\psi$, $Q_\psi$ of the indices). The
functions $R_x\phi^*$ and $R_x\phi$ can further be replaced by $R_\psi^{-1}\phi^*$ and $R_\psi^{-1}\phi$,
the permutation $R_x$ applied to the arguments of any factorized function
$\phi$ being equivalent to the reciprocal permutation applied to the
indices. We thus get

$$H^\psi_{P,Q} = \int P_\psi R_\psi^{-1}\phi^*\, H\, Q_\psi R_\psi^{-1}\phi \; dV = H^\psi_{PR^{-1},\,QR^{-1}}. \tag{341}$$

With $R = Q$ this reduces to

$$H^\psi_{P,Q} = H^\psi_{PQ^{-1}}, \tag{341 a}$$

where $H^\psi_R$ is an abbreviation for $H^\psi_{R,I}$. The difference between this
result and the expression (338 a) for the matrix element of $H$ with
respect to the original functions $P_x\phi$ and $Q_x\phi$ consists only in the order
in which the two permutations $P$ and $Q^{-1}$ must be multiplied by each
other. We shall presently see that thanks to this difference it is
possible to reduce our perturbation problem to a simpler form,
corresponding to the replacement of the energy operator $H$ by the
equivalent ‘permutation operator’

$$W = \sum_R H^\psi_R\, R_\psi. \tag{342}$$

The fact that the two operators are equivalent so far as the first
approximation equations (336) are concerned is proved by comparing
the matrix elements of $W$ and $H$ with respect to the functions $P_\psi\phi$.
We have, namely,

$$W^\psi_{P,Q} = \sum_R H^\psi_R \int P_\psi\phi^*\, R_\psi Q_\psi\phi \; dV,$$

which in view of the orthogonality and normalizing conditions for the
functions $P_\psi\phi$ reduces to

$$W^\psi_{P,Q} = H^\psi_{PQ^{-1}} \qquad (RQ = P),$$

that is, to $H^\psi_{P,Q}$ according to (341).

A similar result cannot be obtained with the wave functions $P_x\phi$ which
have been used before, for with $W$ defined by the formula $W = \sum_R H_R R_x$
we get $W_{P,Q} = H_{PQ^{-1}}$. There can, however, be no correspondence
between this expression and the matrix element $H_{P,Q} = H_{Q^{-1}P}$, for
the two permutations $PQ^{-1}$ and $Q^{-1}P$ are in general quite different.
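
The matrix elements of a permutation operator of the form (342) with respect to an orthonormal set of functions labelled by permutations can be written down explicitly in a small model. The sketch below is illustrative only: the coefficients standing for $H_R$ are made-up numbers, and the basis function number `index[Q]` is assumed to stand for the function obtained from $\phi$ by the permutation $Q$, on which $R$ acts by sending it to the function labelled $RQ$.

```python
import itertools

import numpy as np

perms = list(itertools.permutations(range(3)))
index = {p: k for k, p in enumerate(perms)}


def compose(p, q):
    """Product pq: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(p)))


def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)


rng = np.random.default_rng(2)
h = {r: rng.normal() for r in perms}  # made-up coefficients standing for H_R

g = len(perms)
W = np.zeros((g, g))
for r in perms:
    for q in perms:
        # R maps the basis function labelled Q into the one labelled RQ.
        W[index[compose(r, q)], index[q]] += h[r]

# The only surviving term has RQ = P, i.e. R = P Q^{-1}.
matches_PQinv = all(
    np.isclose(W[index[p], index[q]], h[compose(p, inverse(q))])
    for p in perms for q in perms
)

# P Q^{-1} and Q^{-1} P are in general different permutations.
orders_differ = any(
    compose(p, inverse(q)) != compose(inverse(q), p)
    for p in perms for q in perms
)
print(matches_PQinv, orders_differ)
```

The operator therefore reproduces a given matrix only if that matrix depends on $P$ and $Q$ through the product $PQ^{-1}$, which is the case for (341 a) but not for (338 a).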

The form of the energy operator $H$ has been left hitherto quite
arbitrary (apart from its symmetry with respect to all the electrons).
Now in all actual problems $H$ can be written down in the form

$$H = \sum_i E(x_i, p_i) + \sum_{i<k} F(x_i, x_k), \tag{343}$$

where the first term represents the sum of the energies of the separate
electrons, supposed to move independently, while the second term is
equal to their interaction energy, so that $F(x_i, x_k) = \dfrac{e^2}{r(x_i, x_k)}$ ($r$ being
the distance between the $i$th and the $k$th electrons). It should
be emphasized that in writing down the expression (343) we must not
consider the energy $E(x_i, p_i)$ as corresponding to the approximate
description of the motion by means of the individual wave function
$\psi_i(x_i)$. The latter can correspond to a somewhat different energy
operator $E_i(x_i, p_i)$ involving some additional terms which serve to
account in a simplified way for the mutual action of the $i$th electron
with the rest, by an adequately chosen value of the ‘screening constant’
in the case of a complex atom, or by some type of ‘self-consistent’
field. The difference

$$S = H - \sum_i E_i(x_i, p_i) \tag{343 a}$$

can be defined as the perturbation energy. In order to obtain by our
perturbation method a good approximation to the truth we must
adequately determine the ‘effective’ energy operators $E_i$ for the
individual electrons in such a way that the matrix elements of the
perturbation energy $S$ should be as small as possible. We shall come back
to this question in § 43. We are interested here only in the specialization
of our general theory for the actual case of an energy operator of the
form (343).

We shall assume for the sake of simplicity the functions $\psi_i$ and
consequently $P\phi$ to be mutually orthogonal (and of course normalized
to 1). The matrix element $E_R$ of the energy $E(x_i, p_i)$ defined by the
general formula

$$E_R = \int R\phi^*\, E(x_i, p_i)\, \phi \; dX \qquad (dX = dx_1 \ldots dx_n)$$

is then easily seen to vanish for all the permutations $R$ except the
identical one, in which case it reduces to

$$E_i = \int \psi_i^*\, E(x_i, p_i)\, \psi_i \; dx_i, \tag{344}$$

that is, to the average value of the energy of the $i$th electron with
regard to the external field alone for the state of motion which was
initially assigned to it. It should be kept in mind that this motion,
inasmuch as it is described by the approximate energy operator
$E_i(x_i, p_i)$ which contains some additional external field more or less
equivalent to the mutual action of the $i$th electron with the rest, differs
from the motion described by the operator $E(x_i, p_i)$, and that accordingly
the energy $E_i$ is in general different from the characteristic value
$E'_i$ of the energy corresponding to the wave function $\psi_i$.

Taking the matrix element $F_R$ of the interaction energy $F(x_i, x_k)$,

$$F_R = \int R\phi^*\, F(x_i, x_k)\, \phi \; dV,$$

we easily see that it differs from zero in two cases only, namely, in the
case of the identical permutation, when it reduces to

$$F_{ik} = \iint \psi_i^*(x_i)\,\psi_k^*(x_k)\, F(x_i, x_k)\, \psi_i(x_i)\,\psi_k(x_k) \; dx_i\, dx_k, \tag{344 a}$$

and in that of a transposition $R = T_{ik}$ involving the interchange between
the $i$th and $k$th electrons. We shall denote its value for this case
by $G_{ik}$, where

$$G_{ik} = \iint \psi_i^*(x_k)\,\psi_k^*(x_i)\, F(x_i, x_k)\, \psi_i(x_i)\,\psi_k(x_k) \; dx_i\, dx_k. \tag{344 b}$$

All the other matrix elements of $E$ and $F$, and consequently all the
coefficients $H_R$ for such permutations $R$ which are different from the
identical permutation or from a transposition vanish.
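
The integrals (344 a) and (344 b) can be evaluated numerically for a toy model. Everything specific in the sketch below is an assumption made for illustration: the orbitals are those of a particle in a one-dimensional box, the interaction is a softened repulsion $1/\sqrt{r^2 + a^2}$ (the softening avoids the singularity of $1/r$ in one dimension), and the integrals are replaced by sums over a midpoint grid.

```python
import numpy as np

L, N = 1.0, 200                       # box length and number of grid points
x = (np.arange(N) + 0.5) * L / N      # midpoint grid
dx = L / N


def box_orbital(k):
    """k-th stationary state of a particle in a box (assumed example)."""
    return np.sqrt(2.0 / L) * np.sin(k * np.pi * x / L)


def interaction(a=0.05):
    """Softened repulsion 1/sqrt(r^2 + a^2) tabulated on the grid."""
    r = x[:, None] - x[None, :]
    return 1.0 / np.sqrt(r**2 + a**2)


def direct(psi_i, psi_k, F):
    """F_ik of (344 a): interaction of the two charge densities."""
    rho_i = np.conj(psi_i) * psi_i
    rho_k = np.conj(psi_k) * psi_k
    return np.real(rho_i @ F @ rho_k) * dx * dx


def exchange(psi_i, psi_k, F):
    """G_ik of (344 b); x_i is the first index of F and x_k the second."""
    at_x_i = np.conj(psi_k) * psi_i   # psi_k*(x_i) psi_i(x_i)
    at_x_k = np.conj(psi_i) * psi_k   # psi_i*(x_k) psi_k(x_k)
    return np.real(at_x_i @ F @ at_x_k) * dx * dx


psi_1, psi_2 = box_orbital(1), box_orbital(2)
F = interaction()

F_12 = direct(psi_1, psi_2, F)
G_12 = exchange(psi_1, psi_2, F)
print(F_12, G_12)
```

For a repulsive interaction of this kind the exchange integral comes out positive and smaller than the direct one, since their difference is the integral of $F$ weighted with the square of an antisymmetrized product.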

It should be noted that we obtain the same expressions for the matrix
elements of $E$ and $F$ with respect to the wave functions $P_\psi\phi$. The
identification of the integration variables in (344 a) and (344 b) with
the coordinates of the $i$th and the $k$th electrons is irrelevant for the
value of $F_{ik}$ and $G_{ik}$, this value being determined by the states to which
the two electrons are referred, and not by the individuality of these
electrons. We could therefore write

$$F_{ik} = F^\psi_{ik} = \iint \psi_i^*(x')\,\psi_k^*(x'')\, F(x', x'')\, \psi_i(x')\,\psi_k(x'') \; dx'\, dx''$$

and

$$G_{ik} = G^\psi_{ik} = \iint \psi_i^*(x')\,\psi_k^*(x'')\, F(x', x'')\, \psi_i(x'')\,\psi_k(x') \; dx'\, dx'',$$

leaving the indices of the two electrons unspecified.

The permutation operator $W$ is thus reduced in all actual problems
to the relatively simple form

$$W = W^0 + \sum_{i<k} G_{ik}\, T^\psi_{ik}, \tag{345}$$

where

$$W^0 = \sum_i E_i + \sum_{i<k} F_{ik} \tag{345 a}$$

can be defined as the approximate value of the energy of the system
under consideration, the second term in (345) representing the operator
of the ‘exchange’ energy.
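
The operator (345) can be diagonalized directly in a small case. The sketch below treats three electrons in three different orbitals; the numerical values of $W^0$ and $G_{ik}$ are made up, and the transpositions are represented (an assumption of this illustration) by their action on the six orthonormal functions labelled by permutations.

```python
import itertools

import numpy as np

n = 3
perms = list(itertools.permutations(range(n)))
index = {p: k for k, p in enumerate(perms)}
g = len(perms)


def compose(p, q):
    """Product pq: apply q first, then p."""
    return tuple(p[q[i]] for i in range(n))


def transposition(i, k):
    t = list(range(n))
    t[i], t[k] = t[k], t[i]
    return tuple(t)


W0 = -10.0                                    # made-up value of W^0
G = {(0, 1): 0.7, (0, 2): 0.4, (1, 2): 0.2}   # made-up exchange integrals

W = W0 * np.eye(g)
for (i, k), G_ik in G.items():
    t = transposition(i, k)
    for q in perms:
        # T_ik maps the basis function labelled Q into the one labelled T_ik Q.
        W[index[compose(t, q)], index[q]] += G_ik

levels = np.linalg.eigvalsh(W)  # W is symmetric, transpositions being self-reciprocal
print(levels)
```

The highest and the lowest of the six levels are $W^0 + \sum G_{ik}$ and $W^0 - \sum G_{ik}$, belonging to the symmetrical and to the antisymmetrical function respectively; the remaining levels lie between them.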


42. Introduction of the Spin Coordinates and Solution of the
Perturbation Problem with Antisymmetrical Wave Functions

The results of the preceding section cannot be directly applied to the 
general problem of the motion of a system of electrons, for this implies 
the introduction of the spin coordinates which have been ignored 
hitherto. Even if we neglect the spin forces—which we shall always do 
in the sequel—we must take into account the spin coordinates and the 
spin quantum numbers in order to set up the antis 3 mimetrical wave 
functions which describe a system of electrons. 

We shall consider here the problem of the approximate determination
of the antisymmetrical wave functions with spin, which belong to a
spinless energy operator $H$, with the help of the individual wave functions
$\psi_i(x, \xi)$ describing the motion of the separate electrons in a given
external field ($\xi$ denotes the additional spin coordinate and the index
$i$ is supposed to contain the spin quantum number).

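This problem admits at first sight a simple and unique solution
expressed by the determinant

$$\Phi = \frac{1}{\sqrt{n!}} \begin{vmatrix} \psi_1(x_1, \xi_1) & \psi_1(x_2, \xi_2) & \cdots & \psi_1(x_n, \xi_n) \\ \psi_2(x_1, \xi_1) & \psi_2(x_2, \xi_2) & \cdots & \psi_2(x_n, \xi_n) \\ \vdots & \vdots & & \vdots \\ \psi_n(x_1, \xi_1) & \psi_n(x_2, \xi_2) & \cdots & \psi_n(x_n, \xi_n) \end{vmatrix} \tag{346}$$

since no other wave functions but the antisymmetrical one need be
taken into account in connexion with the exchange phenomenon.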

The simplification with regard to the exchange degeneracy introduced
by the antisymmetry condition is, however, balanced by the additional
degeneracy, due to the possibility of assigning to each electron two
different spin-states connected with the same type of orbital motion and
corresponding to the same value of the energy. We thus get for the
whole system of $n$ electrons, distributed between $n$ ‘orbits’, i.e. spinless
states, which can be specified by certain functions of the geometrical
coordinates alone $\psi_1(x), \psi_2(x), \ldots, \psi_n(x)$, a degenerate set of $2^n$ states
differing from each other by the spin quantum numbers $m_1, m_2, \ldots, m_n$,
associated with each spinless state.

The individual states with spin can be described by the functions

$$\psi_i(x, \xi) = \psi_i(x)\,\delta_{m_i\xi}, \tag{346 a}$$

where $m$ and $\xi$ assume the values $\tfrac12$ and $-\tfrac12$, $\delta_{m\xi}$ being equal to 1 for
$\xi = m$, and to 0 for $\xi \neq m$ (it should be remembered that $m$ denotes
the characteristic value of the component of the spin-matrix $\sigma$ along
some fixed axis).

The spinless functions $\psi_i(x)$ need not be all different; they can occur
in pairs, under the condition that the associated spin quantum numbers
$m_i$ are different. Instead of four degenerate states we get for each such
pair only two, so that the total number of degenerate states of the
whole system is equal to $2^{n'+n''} = g$, where $n'$ is the number of singly
occupied spinless states and $n''$ the number of doubly occupied spinless
states ($n = n' + 2n''$).

In the absence of any other degeneracy except the spin one [and the
exchange degeneracy which is taken care of by using as zero approximation
the antisymmetrical function (346)], the problem of determining
to the first approximation the wave functions with spin $\chi(x, \xi)$
corresponding to the spinless energy operator $H$ can be solved by defining
these functions as linear combinations of $g$ functions of the type (346),

$$\chi(x, \xi) = \sum_{\alpha=1}^{g} C_\alpha \Phi_\alpha, \tag{347}$$

where the coefficients $C_\alpha$ satisfy the system of $g$ equations,

$$\sum_\beta (H_{\alpha\beta} - H' J_{\alpha\beta})\, C_\beta = 0 \qquad (\alpha = 1, 2, \ldots, g), \tag{347 a}$$

under the compatibility condition

$$|H_{\alpha\beta} - H' J_{\alpha\beta}| = 0, \tag{347 b}$$

which serves for the determination of the energy-levels $H'$.

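The matrix elements $H_{\alpha\beta}$ and $J_{\alpha\beta}$ must be defined here by the
expressions

$$H_{\alpha\beta} = \sum_\xi \int \Phi_\alpha^*\, H\, \Phi_\beta \; dV, \qquad J_{\alpha\beta} = \sum_\xi \int \Phi_\alpha^*\, \Phi_\beta \; dV, \tag{348}$$

where $\sum_\xi$ denotes a summation over the spin coordinates of all the
$n$ electrons involved in the functions $\Phi$.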

Taking into account the relation

$$\sum_\xi \delta_{m'_i\xi}\,\delta_{m_i\xi} = \delta_{m'_i m_i}, \tag{348 a}$$

which follows from the definition of the symbols $\delta$ (where $\xi$ refers
to one particular electron), we can easily find that the matrix elements
(348) can be different from zero only if the functions $\Phi_\alpha$ and $\Phi_\beta$ are
associated with the same value of the resulting spin component

$$m = \sum_i m_i. \tag{348 b}$$

In fact, $H_{\alpha\beta}$ and $J_{\alpha\beta}$ can be expressed as a sum of terms each involving
a product of $n$ factors of the type (348 a). Now unless the two states
$\alpha$ and $\beta$ are associated with an equal number of spins pointing in the
same direction, i.e. specified by spin quantum numbers $m_i$ having the
same value ($\tfrac12$ or $-\tfrac12$), one at least of these $n$ factors will vanish in
each such term.

We thus see that the functions $\Phi_\alpha$ can be divided into a number of
non-combining groups belonging to different characteristic values of the
total spin component $m$ of all the electrons along a certain axis, $z$ say.
This result is a direct corollary of the fact that the spinless energy
operator $H$ commutes with each of the spin matrices $\sigma_{zi}$ and consequently
with their sum

$$\sigma_z = \sum_{i=1}^{n} \sigma_{zi}.$$

Now this means that the matrix of $H$ is diagonal with respect to $m$.
We have in fact (leaving other variables out of account)

$$(H\sigma_z - \sigma_z H)_{m'm''} = H_{m'm''}\,(m'' - m') = 0,$$

whence it follows that $H_{m'm''} = 0$ unless $m' = m''$.

The subdivision of the functions $\Phi_\alpha$ into groups belonging to the same
value of $m$ greatly simplifies the perturbation problem under consideration,
for the $g$ equations (347 a) are split up hereby into a number of
separate systems, containing coefficients which refer to functions of
the same group only. The function $\chi(x, \xi)$ stabilized for the perturbation
will belong accordingly to a definite characteristic value $m$ of $\sigma_z$
specifying the corresponding group. The equations (347), (347 a), and
(347 b) will be understood in the sequel to refer to one particular group
of $g$ states with the same value of $m$.

If all the spinless states $\psi_1, \psi_2, \ldots, \psi_n$ are different, the number $g$ is given
by the formula [$C^r_n$ is the usual binomial coefficient $n!/\{r!\,(n-r)!\}$]

$$g(m) = C_n^{\frac12 n + m}. \tag{349}$$

In fact the number of ways in which $n_+$ positive and $n_-$ negative spins
can be associated with the $n$ different orbits is obviously equal to
$C_n^{n_+} = C_n^{n_-}$, which reduces to (349) since

$$m = \tfrac12 (n_+ - n_-), \tag{349 a}$$

that is,

$$n_\pm = \tfrac12 n \pm m. \tag{349 b}$$

The sum $\sum g(m)$ taken for all values of $m$ from $-\tfrac12 n$ to $\tfrac12 n$ is equal to
$\sum_{n_+ = 0}^{n} C_n^{n_+} = 2^n$, as of course it should be.
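
The counting formula (349) and the sum rule can be checked in a few lines. The sketch below uses exact fractions for the half-integral values of $m$; the function names are hypothetical helpers introduced for illustration.

```python
from fractions import Fraction
from math import comb


def g(n, m):
    """Number of states with axial spin m for n different orbits, formula (349)."""
    n_plus = Fraction(n, 2) + m  # (349 b)
    if n_plus.denominator != 1 or not 0 <= n_plus <= n:
        return 0
    return comb(n, int(n_plus))


def axial_values(n):
    """The admissible values m = -n/2, -n/2 + 1, ..., n/2."""
    return [Fraction(-n, 2) + k for k in range(n + 1)]


counts = {m: g(4, m) for m in axial_values(4)}
total = sum(counts.values())
print([counts[m] for m in axial_values(4)], total)  # [1, 4, 6, 4, 1] 16
```

For four different orbits the groups with $m = -2, -1, 0, 1, 2$ contain 1, 4, 6, 4, 1 states, making up $2^4 = 16$ in all.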

The $g(m)$ functions forming a certain group can be obtained from
one of them, $\Phi$, by permuting the spin quantum numbers $m_1, m_2, \ldots, m_n$
associated with the separate orbits between the latter, with the condition
that identical orbits, if present, should always be associated
with opposite spins. Such permutations $P$ must be distinguished from
those which we have considered before and which referred either to the
distribution of the electrons between the (spinless) states or of the states
between the electrons.

Just as before, however, it can be concluded from this circumstance
that the number of different matrix elements $H_{\alpha\beta}$ and $J_{\alpha\beta}$ is reduced
from $g(m)^2$ to $g(m)$. We shall not stop to investigate this question, for, as
has been shown by Slater, all we need to know are the diagonal elements
of the energy, from which the perturbed energy-levels can easily be
computed without directly solving the perturbation equations (347 a).

The diagonal elements of $H$ are easily seen to have the same value,
$H(m)$ say, for all the $g(m)$ functions $\Phi$. If the individual wave functions
(with spin) $\psi_i$ are orthogonal and normalized to 1, i.e. if $J_{\alpha\beta} = \delta_{\alpha\beta}$
(which we shall assume to be the case), then according to (347 b) the
sum of the diagonal elements of $H$, that is, the product $H(m)g(m)$,
must be equal to the sum of the $g(m)$ characteristic values of $H$
belonging to $m$ which are the roots of equation (347 b). Now whereas
$m$, being the characteristic value of the projection $\sigma_z$ of the resulting
spin $\sigma$ on the direction of the arbitrarily chosen $z$-axis, depends upon
the choice of its direction in space, the characteristic values of the
energy must obviously be independent of the choice of this direction,
being in fact invariant with respect to the rotations of the coordinate
axes. They must be determined therefore by the characteristic values
$s$ of the resulting spin itself, which are also invariant both with respect
to rotations of the coordinate axes and to the permutations of the
electrons.

So long as the forces due to the spin of the electrons (including the 
effects of their orientation in an external magnetic field) are neglected, 
all those states which belong to the same value of the resulting spin 
form a degenerate set, so that their energy is wholly determined by s. 
The number of such states f(s) and their energy H(s) can easily be 
calculated from g{m) and H(m) if we take into account the fact that, 
for a given m, s can assume the following values: 

s = \m\, |ra| + l,...,$n. 

Subdividing all the states belonging to a definite m into groups specified 
by different values of s, we thus get 

\[ g(m) = \sum_{s=|m|}^{n/2} f(s) \tag{350} \]

and 

\[ g(m)H(m) = \sum_{s=|m|}^{n/2} f(s)H(s). \tag{350 a} \]

The latter equation can be rewritten in the form 

\[ H(m) = \frac{\sum_{s=|m|}^{n/2} f(s)H(s)}{\sum_{s=|m|}^{n/2} f(s)}, \tag{350 b} \]

which expresses the fact that the diagonal elements of the energy H 
are equal to the average value of the energy for all the states 
associated with the corresponding value of m. 

From (350) and (350 a) we obtain 

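\[ -f(s) = g(s+1) - g(s) = \Delta g(s) \tag{351} \]

and 

\[ -f(s)H(s) = g(s+1)H(s+1) - g(s)H(s) = \Delta[g(s)H(s)], \tag{351 a} \]

whence 

\[ H(s) = \frac{\Delta[g(s)H(s)]}{\Delta g(s)}. \tag{351 b} \]

On the right-hand sides g and H stand for the functions g(m) and H(m) 
taken at m = s and m = s+1. Since g(s) is known, being determined by the 
equation (349) in the case of n different orbits, our problem reduces to 
the calculation of a diagonal element of H for a given value of m (= s). 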

We shall take for the operator H the expression (343), i.e. 

\[ H = \sum_i E(x_i, p_i) + \sum_{i<k} F(x_i, x_k), \]

which is the only one occurring in practice. 

We shall further write one of the functions $\Phi$ defined by the 
determinant (346) in the form 

\[ \Phi = \frac{1}{\sqrt{n!}} \sum_P \epsilon_P\, P_x\phi(x)\, P_\xi\delta_m(\xi), \tag{352} \]

where $\phi(x) = \psi_1(x_1)\psi_2(x_2)\ldots\psi_n(x_n)$ is the product of the spinless 
functions and $\delta_m(\xi) = \delta_{m_1\xi_1}\delta_{m_2\xi_2}\ldots\delta_{m_n\xi_n}$ the product of the corresponding 
spin factors, $\epsilon_P$ being equal to 1 or $-1$ for permutations of the even 
and odd type respectively (the permutations $P_x$ refer to the geometrical 
coordinates and $P_\xi$ to the spin coordinates of the electrons). 

Let us consider the case when all the n orbits $\psi_1, \psi_2, \ldots, \psi_n$ are 
different (and orthogonal to each other). The expression 

\[ \sum_\xi \int \Phi^* H \Phi\, dX = \frac{1}{n!} \sum_P \sum_Q \epsilon_P \epsilon_Q \int P_x\phi^*\, H\, Q_x\phi\, dX \sum_\xi P_\xi\delta \cdot Q_\xi\delta, \tag{352 a} \]

which defines the diagonal matrix element of H with respect to the 
state $\Phi$ (or the corresponding average value) is then easily simplified. 
The integral $H_{PQ} = \int P_x\phi^*\, H\, Q_x\phi\, dX$ differs from zero, as we know, 
only when the permutations P and Q are identical (P = Q) or when 
they differ by a transposition $T_{ik}$ of any two electrons ($Q = PT_{ik}$). It 
reduces in the first case to $H_I = W^0 = \sum_i U_i + \sum_{i<k} F_{ik}$ and in the 
second to $G_{ik}$ [cf. equations (344), (344 a), (344 b), and (345 a) of the 
preceding section]. 

We have further, when P = Q, 

\[ \frac{1}{n!} \sum_P \sum_\xi P_\xi\delta \cdot P_\xi\delta = 1, \]

since the total number of different permutations is just equal to n!. 

A little more care is required for the calculation of the preceding 
expressions when $Q = PT_{ik}$. It is clear that the function 

\[ \delta = \delta_{m,\xi} \]

remains unaltered if the same permutation is applied both to the 
spin coordinates $\xi_i$ and to the spin quantum numbers $m_i$ (or more 
exactly, to the indices of these variables). Any permutation $P_\xi$ of the 
former can therefore be replaced by the reciprocal permutation $P_m^{-1}$ of 
the latter. We thus have 

\[ \sum_\xi P_\xi\delta \cdot P_\xi T_{ik}\delta = \sum_\xi P_m^{-1}\delta \cdot P_m^{-1}T_{ik}\delta, \]

where $T_{ik}$ denotes, as before, the interchange of the coordinates $\xi_i$ and 
$\xi_k$, which in the original distribution were assigned to the ith and kth 
electrons. 

Now in the function $P_m^{-1}\delta$ these coordinates will be associated with 
the spin quantum numbers $P_m^{-1}m_i = m_{i'}$ and $P_m^{-1}m_k = m_{k'}$, where i' and k' 
are the numbers derived from i and k by the permutation $P^{-1}$. In the 
function $P_m^{-1}T_{ik}\delta$ the same coordinates will be associated with the spin 
quantum numbers $m_{k'}$ and $m_{i'}$ respectively. The sum $\sum_\xi P_m^{-1}\delta \cdot P_m^{-1}T_{ik}\delta$ 
will obviously be equal to 1 if these two numbers are equal ($+\tfrac12$ or $-\tfrac12$) 
and to 0 if they are different. 

Let us suppose that the numbers $m_1, m_2, \ldots, m_n$ are labelled in such 
a way that the first $n_+$ of them are equal to $+\tfrac12$ and the last $n_-$ to 
$-\tfrac12$ ($n_+ + n_- = n$). If now all the permutations $P_m^{-1}$ are applied to their 
indices, then each index will have an equal chance of being found at any 
place of the line, under the condition that two originally different 
indices will always have different places. 

The number of positions which any two indices corresponding 
originally to i and k can assume in the row of the $n_+$ positive spins is 
obviously equal to $n_+(n_+ - 1)$, and in that of the negative spins to 
$n_-(n_- - 1)$. The sum of these two numbers multiplied by (n − 2)! will 
give the total number of distributions (i.e. permutations P) for which 
the two spin quantum numbers in question are equal. We thus 
see that in the case $Q = PT_{ik}$ the expression 

\[ \frac{1}{n!} \sum_P \sum_\xi P_\xi\delta \cdot Q_\xi\delta \]

is equal, irrespective of the choice of i and k, to 

\[ \frac{n_+(n_+ - 1) + n_-(n_- - 1)}{n(n-1)}. \]

The expression (352 a) for the average value of the energy assumes 
accordingly the following form 

\[ H(m) = \sum_i U_i + \sum_{i<k} F_{ik} - \frac{n_+(n_+ - 1) + n_-(n_- - 1)}{n(n-1)} \sum_{i<k} G_{ik}, \tag{353} \]

where the negative sign corresponds to the fact that $\epsilon_P \epsilon_Q = -1$ for 
two permutations differing from each other by a transposition (one of 
them being of the even and the other of the odd type). Writing $W^0$ 
for the sum of the first two terms and putting $m = \tfrac12(n_+ - n_-)$, i.e. 
$n_+ = \tfrac12 n + m$, $n_- = \tfrac12 n - m$, we can represent H as a function of m 
explicitly by the formula 

\[ H(m) = W^0 - \frac{\tfrac12 n^2 + 2m^2 - n}{n(n-1)} \sum_{i<k} G_{ik}. \tag{353 a} \]


As would be expected, this expression is a function of m alone, and is 
independent of the choice of $\Phi$ out of the group belonging to a given 
m, i.e. is the same for all diagonal elements of the energy matrix. We 
can now pass on to the calculation of the characteristic values of the 
energy as functions of the resulting spin s. 

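In the first place we have, according to (351) in conjunction with 
(349), 

\[ f(s) = C_n^{\frac12 n+s} - C_n^{\frac12 n+s+1} = C_n^{\frac12 n+s}\, \frac{2s+1}{\tfrac12 n+s+1}. \tag{354} \]

Further, according to (351 a) and (353 a), 

\[ f(s)H(s) = f(s)W^0 - \frac{C_n^{\frac12 n+s}\,[\tfrac12 n^2 + 2s^2 - n] - C_n^{\frac12 n+s+1}\,[\tfrac12 n^2 + 2(s+1)^2 - n]}{n(n-1)} \sum_{i<k} G_{ik}, \]

whence 

\[ H(s) = W^0 - \frac{\tfrac12 n^2 - 2n + 2s(s+1)}{n(n-1)} \sum_{i<k} G_{ik}. \tag{354 a} \]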


This formula was originally derived by Heitler in connexion with the 
spin theory of chemical forces. The derivation given above is a 
modification of that given by Slater (in his theory of energy-levels in a 
complex atom) and by Pauli (in connexion with Heisenberg’s theory of 
ferromagnetism). 
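The passage from the diagonal elements H(m) to the levels H(s) can be checked with exact rational arithmetic. The sketch below is only an illustration under stated assumptions: g(m) is taken to be the binomial coefficient $C_n^{\frac12 n+m}$ (n different orbits), H(m) is written as $W^0 - c(m)\sum_{i<k}G_{ik}$ with $c(m) = (\tfrac12 n^2 + 2m^2 - n)/[n(n-1)]$ as in (353 a), and the coefficient for H(s) obtained from (351 b) is compared with the closed form $[\tfrac12 n^2 - 2n + 2s(s+1)]/[n(n-1)]$. Arguments are passed as twice the spin value so that half-integers stay exact; all function names are hypothetical.

```python
from fractions import Fraction
from math import comb


def g(n, m2):
    """Number of spin functions with projection m = m2/2: C(n, n/2 + m)."""
    k2 = n + m2  # twice (n/2 + m)
    if k2 % 2 or not 0 <= k2 // 2 <= n:
        return 0
    return comb(n, k2 // 2)


def diag_coeff(n, m2):
    """Coefficient c(m) of sum_{i<k} G_ik in H(m) = W0 - c(m) * sum G_ik."""
    m = Fraction(m2, 2)
    return (Fraction(n * n, 2) + 2 * m * m - n) / (n * (n - 1))


def f(n, s2):
    """Number of states with resulting spin s = s2/2: f(s) = g(s) - g(s+1)."""
    return g(n, s2) - g(n, s2 + 2)


def level_coeff(n, s2):
    """Coefficient of sum G_ik in H(s), from [g(s)c(s) - g(s+1)c(s+1)] / f(s)."""
    num = g(n, s2) * diag_coeff(n, s2) - g(n, s2 + 2) * diag_coeff(n, s2 + 2)
    return num / f(n, s2)


def closed_form(n, s2):
    """Closed form [n^2/2 - 2n + 2s(s+1)] / [n(n-1)]."""
    s = Fraction(s2, 2)
    return (Fraction(n * n, 2) - 2 * n + 2 * s * (s + 1)) / (n * (n - 1))


if __name__ == "__main__":
    for n in range(2, 7):
        for s2 in range(n % 2, n + 1, 2):
            print(n, Fraction(s2, 2), f(n, s2), level_coeff(n, s2), closed_form(n, s2))
```

For n = 2 the coefficient is +1 for s = 1 and −1 for s = 0, i.e. the levels $W^0 - G_{12}$ and $W^0 + G_{12}$.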

Pauli’s method of dealing with the perturbation problem under consideration 
differs from that of Slater in the choice of the original wave 
functions with spin. Instead of taking the antisymmetrical functions 
defined by the determinant (346) we can use as the zero approximation, 
just as in the spinless case, the factorized functions obtained by multiplying 
by each other the individual functions $\psi_i(x_k)\,\delta_{m_i\xi_k}$. 

We shall slightly modify our previous notation by introducing the 
letters $J_1, J_2, \ldots, J_n$ to specify the different spinless orbits with which 
the separate electrons are associated and by writing $(J_i|x_k)$ instead of 
$\psi_i(x_k)$, and $(m_i|\xi_k)$ instead of $\delta_{m_i\xi_k}$. The factorized functions with which 
we must start can be obtained from one of them 

\[ (J_1|x_1)\ldots(J_n|x_n)\,(m_1|\xi_1)\ldots(m_n|\xi_n) = (J|x)(m|\xi) \tag{355} \]

by permuting the different electrons, i.e. by applying the same permutations 
P to the arguments x and $\xi$, and also taking the two possible 
values for each of the spin quantum numbers $m_i$. Now, as has been 
shown before, only those functions (355) must be combined with each 
other which correspond to the same value of the sum $\sum_i m_i = m$ and 
which accordingly can be obtained from each other by applying various 
permutations R (independent of P) to the indices m. The set of 
degenerate states which must be taken into account for the construction 
of the wave function $\chi(x, \xi)$ stabilized for the perturbation can thus be 
specified by the expression 

\[ (J|Px)(Rm|P\xi), \tag{355 a} \]

where P and R are arbitrary permutations. Since a permutation of 
the arguments (x, $\xi$) is equivalent to the reciprocal permutation of the 
indices (J, m), we can replace the preceding expression by 

\[ (P^{-1}J|x)(P^{-1}Rm|\xi), \]

or by 

\[ \phi_{P,Q} = (PJ|x)(Qm|\xi), \tag{355 b} \]

P and Q being independent of each other. 

The n! different permutations Q actually lead to 

\[ g(m) = C_n^{n_+} = C_n^{\frac12 n+m} \]

different spin factors $(Qm|\xi) = (m|Q^{-1}\xi)$ which are distinguished from 
each other by the coordinates $\xi_1, \xi_2, \ldots$ associated with the values $m_i = \tfrac12$ 
and $m_i = -\tfrac12$ respectively. In what follows we shall assume the permutations 
Q to be subdivided into g(m) classes, corresponding to the 
different functions $(Qm|\xi)$, and shall take for Q only one representative 
of each class, treating all the permutations of each class as identical. 
The function $\chi(x, \xi)$ can now be defined by the formula 

\[ \chi(x, \xi) = \sum_P \sum_Q C_{P,Q}\, \phi_{P,Q}, \tag{356} \]

where the coefficients $C_{P,Q}$ are determined by the equations 

\[ \sum_{P'} \sum_{Q'} (H_{P,Q;P',Q'} - H' J_{P,Q;P',Q'})\, C_{P',Q'} = 0 \tag{356 a} \]

with 

\[ H_{P,Q;P',Q'} = \sum_\xi \int \phi^*_{P,Q}\, H\, \phi_{P',Q'}\, dV = \sum_\xi (Qm|\xi)(Q'm|\xi) \int (x|PJ)\, H\, (P'J|x)\, dV \]

and 

\[ J_{P,Q;P',Q'} = \sum_\xi \int \phi^*_{P,Q}\, \phi_{P',Q'}\, dV = \sum_\xi (Qm|\xi)(Q'm|\xi) \int (x|PJ)(P'J|x)\, dV. \]

So long as we are considering effectively different permutations Q and 
Q' only we can assume the sums $\sum_\xi (Qm|\xi)(Q'm|\xi)$ to vanish except for 
the case Q = Q' when they are equal to 1. The non-vanishing matrix 
elements of H and J thus reduce to 

\[ H_{P,Q;P',Q} = H_{PP'^{-1}}, \qquad J_{P,Q;P',Q} = J_{PP'^{-1}}, \]

where $H_{PP'^{-1}}$ and $J_{PP'^{-1}}$ are the usual matrix elements of H and of the 
unit operator (J = 1) with regard to the spinless functions (PJ|x) and 
(P'J|x). The equations (356 a) can therefore be rewritten in the form 

\[ \sum_{P'} (H_{PP'^{-1}} - H' J_{PP'^{-1}})\, C_{P',Q} = 0, \]

or, if we put $PP'^{-1} = R$, 

\[ \sum_R (H_R - H' J_R)\, C_{R^{-1}P,Q} = 0. \tag{356 b} \]

We can now make use of the fact that the only functions (356) we 
need are the antisymmetrical ones. This means that $\chi(Sx, S\xi) = \epsilon_S\, \chi(x, \xi)$, 
where $\epsilon_S = 1$ for a permutation S of even type and $-1$ for one of 
odd type. Since the application of a permutation S or $S^{-1}$ to the 
arguments x, $\xi$ of the functions (355 b) is equivalent to the application 
of the reciprocal permutation to the indices J, m, we get 

\[ \sum_P \sum_Q C_{P,Q}\, \phi_{S^{-1}P,S^{-1}Q} = \epsilon_S \sum_P \sum_Q C_{P,Q}\, \phi_{P,Q}, \]

or, replacing $S^{-1}P$ and $S^{-1}Q$ in the first sum by P' and Q', 

\[ \sum_{P'} \sum_{Q'} C_{SP',SQ'}\, \phi_{P',Q'} = \epsilon_S \sum_P \sum_Q C_{P,Q}\, \phi_{P,Q} = \epsilon_S \sum_{P'} \sum_{Q'} C_{P',Q'}\, \phi_{P',Q'}, \]

whence it follows that 

\[ C_{SP,SQ} = \epsilon_S\, C_{P,Q}. \tag{357} \]

This gives, if SQ is replaced by Q (i.e. Q by $S^{-1}Q$) and S by $R^{-1}$, 

\[ C_{R^{-1}P,Q} = \epsilon_R\, C_{P,RQ}, \]

so that the equations (356 b) can be rewritten in the form 

\[ \sum_R \epsilon_R (H_R - H' J_R)\, C_{P,RQ} = 0. \tag{357 a} \]

The index P is irrelevant, as it is the same for the whole system of 
equations and can therefore be left out of account. So far as the 
coefficients $C_{P,RQ} = C_{RQ}$ are concerned the summation over R can lead 
to g(m) different values only, which will be multiplied in equations 
(357 a) by the sum of the expressions $\epsilon_R(H_R - H'J_R)$ for all the R’s which 
correspond to equivalent permutations RQ. 

Putting as before $H = \sum_i E(x_i, p_i) + \sum_{i<k} F(x_i, x_k)$ and assuming the 
functions (PJ|x) to be mutually orthogonal, we get 

\[ (H_I - H')\, C_Q - \sum_{i<k} G_{ik}\, C_{T_{ik}Q} = 0, \tag{357 b} \]

where I denotes the identical permutation, so that 

\[ H_I = W^0 = \sum_i U_i + \sum_{i<k} F_{ik}, \]

while $T_{ik}$ corresponds to an interchange between the spin quantum 
numbers $m_i$ and $m_k$. 

The g(m) different coefficients $C_Q$ can be specified unambiguously by 
the indices of the $n_+$ electrons with a positive component of their spin 
along the z-axis. We can thus write (following Pauli) 

\[ C_Q = C(r_1, r_2, \ldots, r_{n_+}), \]

where $r_1, r_2, \ldots, r_{n_+}$ are the indices in question, C being independent of 
the order in which they appear. We can put in particular $r_1 = 1$, 
$r_2 = 2, \ldots, r_{n_+} = n_+$ without affecting the generality of our theory, since 
the choice of the permutation Q in the equations (357 b) is irrelevant 
for their solution. Putting accordingly 

\[ C_{T_{ik}Q} = T_{ik}\, C_Q = T_{ik}\, C(r_1, r_2, \ldots, r_{n_+}), \]

we can rewrite the equation (357 b) in the form 

\[ (H_I - H')\, C(r_1, r_2, \ldots, r_{n_+}) - \sum_{i<k} G_{ik}\, T_{ik}\, C(r_1, r_2, \ldots, r_{n_+}) = 0. \tag{357 c} \]


If we consider the determinant of these equations, whose roots give 
the allowed values of the energy H', we see at once that the sum of 
these values for all the g(m) perturbed states is equal to the sum of the 
coefficients of $C(r_1, r_2, \ldots, r_{n_+})$ (without of course the term H'), that is, 
to the expression 

\[ \sum_r \Bigl( H_I - {\sum_{i<k}}' G_{ik} \Bigr). \]

The summation $\sum'$ † is extended over those pairs of states (or electrons) 
which interchange either two of the indices $r_1, r_2, \ldots, r_{n_+}$ or two of the 
remaining indices, $s_1, \ldots, s_{n_-}$ say (corresponding to negative spins), 
without interchanging any r with any s, whereas the summation $\sum_r$ is 
extended over the g(m) different combinations of the r’s. As a result we 
obtain each $G_{ik}$ multiplied by the number of combinations for which 
the spins associated with the states i and k (or the ith and the kth 
electrons) are both positive or both negative, i.e. 

\[ C_{n-2}^{n_+-2} + C_{n-2}^{n_--2} = C_n^{n_+}\, \frac{n_+(n_+ - 1) + n_-(n_- - 1)}{n(n-1)}, \]

$H_I$ being multiplied by $g(m) = C_n^{n_+}$. We thus get for the average value 
of the energy H' of the g(m) perturbed states the expression 

\[ H(m) = H_I - \frac{n_+(n_+ - 1) + n_-(n_- - 1)}{n(n-1)} \sum_{i<k} G_{ik}, \]

which has been obtained before. 

† i.e. over those indices i, k which both occur either among the $n_+$ indices r or among 
the $n_-$ indices s. 

As has been shown by Dirac, the transpositions $T_{ik}$ occurring in 
(357 c) can be replaced by operators, involving Pauli’s spin matrices 
$\sigma_i$ and $\sigma_k$. Let us consider the scalar product of these spin vectors, 
that is, the operator $\sigma_i \cdot \sigma_k$ applied to some function of $\sigma_i$ and $\sigma_k$, and 
in the first place to $\sigma_i$ and $\sigma_k$ themselves or their components along 
some axis, z say. We have, putting i = 1 and k = 2, 


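\[ (\sigma_1 \cdot \sigma_2)\, \sigma_{1z} = (\sigma_{1x}\sigma_{2x} + \sigma_{1y}\sigma_{2y} + \sigma_{1z}\sigma_{2z})\, \sigma_{1z} \]
\[ \qquad = (\sigma_{1x}\sigma_{1z})\, \sigma_{2x} + (\sigma_{1y}\sigma_{1z})\, \sigma_{2y} + (\sigma_{1z}\sigma_{1z})\, \sigma_{2z}, \]

since the vectors $\sigma_1$ and $\sigma_2$ commute with each other, and further, in 
virtue of the relations (253), § 29, 

\[ (\sigma_1 \cdot \sigma_2)\, \sigma_{1z} = -i\sigma_{1y}\sigma_{2x} + i\sigma_{1x}\sigma_{2y} + \sigma_{2z} \]

or 

\[ (1 + \sigma_1 \cdot \sigma_2)\, \sigma_{1z} = i(\sigma_{1x}\sigma_{2y} - \sigma_{1y}\sigma_{2x}) + \sigma_{1z} + \sigma_{2z} = [\, i(\sigma_1 \times \sigma_2) + \sigma_1 + \sigma_2 \,]_z. \]

Similar expressions are obtained if $\sigma_{1z}$ is replaced by $\sigma_{1x}$ or $\sigma_{1y}$, so that 

\[ (1 + \sigma_1 \cdot \sigma_2)\, \sigma_1 = i\, \sigma_1 \times \sigma_2 + \sigma_1 + \sigma_2. \]

We get likewise 

\[ (1 + \sigma_1 \cdot \sigma_2)\, \sigma_2 = i\, \sigma_2 \times \sigma_1 + \sigma_1 + \sigma_2, \]

and 

\[ \sigma_2\, (1 + \sigma_1 \cdot \sigma_2) = i\, \sigma_1 \times \sigma_2 + \sigma_1 + \sigma_2, \]

whence 

\[ (1 + \sigma_1 \cdot \sigma_2)\, \sigma_1 = \sigma_2\, (1 + \sigma_1 \cdot \sigma_2). \tag{358} \]

We have on the other hand 

\[ (\sigma_1 \cdot \sigma_2)^2 = (\sigma_{1x}\sigma_{2x} + \sigma_{1y}\sigma_{2y} + \sigma_{1z}\sigma_{2z})^2 \]
\[ \qquad = \sigma_{1x}^2\sigma_{2x}^2 + \sigma_{1y}^2\sigma_{2y}^2 + \sigma_{1z}^2\sigma_{2z}^2 + \sigma_{1x}\sigma_{2x}\sigma_{1y}\sigma_{2y} + \sigma_{1y}\sigma_{2y}\sigma_{1x}\sigma_{2x} + \ldots \]
\[ \qquad = 3 + 2\, i\sigma_{1z} \cdot i\sigma_{2z} + \ldots \]
\[ \qquad = 3 - 2\, \sigma_1 \cdot \sigma_2, \]

and consequently 

\[ (1 + \sigma_1 \cdot \sigma_2)^2 = 4. \tag{358 a} \]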

It follows from these equations that the spin operator 

\[ O_{12} = \tfrac12 (1 + \sigma_1 \cdot \sigma_2) \tag{358 b} \]

has the same properties with respect to any function of the spin 
variables $\sigma_1$, $\sigma_2$ as the permutation operator $T_{12}$. This becomes quite 
clear if we rewrite the equation (358) in the form 

\[ O_{12}\, \sigma_1 = \sigma_2\, O_{12}, \]

or in the equivalent form 

\[ O_{12}\, \sigma_2\, O_{12} = \sigma_1, \]

which reduces to 

\[ O_{12}\, \sigma_2\, O_{12}^{-1} = \sigma_1 \]

in view of the equation $O_{12}^2 = 1$ which corresponds to the relation 
$T_{12}^2 = 1$ (= identical permutation). 
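The operator identities of this passage lend themselves to a direct numerical check. The following Python sketch is an illustration, not part of the original argument: it represents the spin matrices of two electrons as 4 × 4 matrices (Kronecker products of the usual 2 × 2 Pauli matrices with the unit matrix), builds $O_{12} = \tfrac12(1 + \sigma_1 \cdot \sigma_2)$, and compares it with the matrix that interchanges the two spin coordinates. The helper names and the basis ordering (++, +−, −+, −−) are assumptions of the sketch.

```python
def matmul(a, b):
    """Product of two matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def kron(a, b):
    """Kronecker product: a acts on electron 1, b on electron 2."""
    return [[a[i][j] * b[k][l] for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]


def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def scale(c, a):
    return [[c * x for x in row] for row in a]


def close(a, b, tol=1e-12):
    """True if two matrices agree element by element within tol."""
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))


UNIT = [[1, 0], [0, 1]]
PAULI = ([[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]])  # x, y, z
SIGMA1 = [kron(s, UNIT) for s in PAULI]  # components of sigma_1
SIGMA2 = [kron(UNIT, s) for s in PAULI]  # components of sigma_2
UNIT4 = kron(UNIT, UNIT)

# Scalar product sigma_1 . sigma_2 and the operator O_12 = (1 + sigma_1 . sigma_2) / 2
DOT = [[0] * 4 for _ in range(4)]
for s1, s2 in zip(SIGMA1, SIGMA2):
    DOT = add(DOT, matmul(s1, s2))
O12 = scale(0.5, add(UNIT4, DOT))

# Interchange of the two spin coordinates in the basis (++, +-, -+, --)
T12 = [[1, 0, 0, 0], [0, 0, 1, 0], [0, 1, 0, 0], [0, 0, 0, 1]]

if __name__ == "__main__":
    print("O12 equals T12:", close(O12, T12))
    print("O12 squared is 1:", close(matmul(O12, O12), UNIT4))
    for name, a, b in zip("xyz", SIGMA1, SIGMA2):
        print("O12 sigma_1%s = sigma_2%s O12:" % (name, name),
              close(matmul(O12, a), matmul(b, O12)))
```

Pure nested lists are used instead of a numerical library so that the sketch has no dependencies; with an array library the same check takes a few lines.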

The equivalence between $O_{12}$ and $T_{12}$ is preserved with regard to the 
functions of the other spin variables $\sigma_3$, $\sigma_4$, etc., since they commute 
with $\sigma_1$ and $\sigma_2$, and further with regard to any function of the type 
$f(r_1, r_2, \ldots, r_n)$ since we can replace the indices $r_k$ of the electrons by the 
corresponding spin variables $\sigma_k$ (or their squares). We can accordingly 
replace the permutation operators $T_{ik}$ in the system of equations (357 c) 
by the spin operators $O_{ik}$ (the fact that the sign need not be changed 
can easily be ascertained by considering a particular case). This system 
of equations can thus be written in the standard form of a wave equation 

\[ (W - H')\, C = 0, \tag{359} \]

where 

\[ W = W^0 - \sum_{i<k} G_{ik}\, O_{ik} = \sum_i U_i + \sum_{i<k} (F_{ik} - \tfrac12 G_{ik}) - \tfrac12 \sum_{i<k} G_{ik}\, \sigma_i \cdot \sigma_k \tag{359 a} \]

is the approximate energy operator, which is equivalent to H as far as 
the first approximation of the perturbation theory is concerned. 

This result, due to Dirac, is very important both from the practical 
point of view (for in many cases it enables one to calculate very easily 
the perturbed energy-levels) and from the theoretical point of view, 
for it shows that the ‘exchange energy’ in connexion with the antisymmetry 
principle can be interpreted, in a purely formal way, as 
due to a fictitious kind of magnetism associated with the spin. In fact 
the expression 

\[ -\tfrac12\, G_{ik}\, \sigma_i \cdot \sigma_k \tag{359 b} \]

can be considered as representing the energy of a fictitious magnetic 
interaction between the ith and kth electrons, their actual magnetic 
moments being replaced by quantities of an electrostatic nature. It 
should be noted that only a part of the exchange energy can be interpreted 
in this way; another part $-\tfrac12 \sum_{i<k} G_{ik}$ goes over into the ordinary 
electrostatic energy $\sum_{i<k} F_{ik}$. 

We shall consider in Part III some important applications of the 
quasi-magnetic effects determined by (359 b) to the theory of the magnetic 
properties of atoms and of ferromagnetic bodies. Another illustration 
of equations (359 a) will be found in the theory of the chemical 
forces between two atoms, inasmuch as no other type of degeneracy 
than that due to the exchange and spin effect has to be taken into 
account. 

The above theory can easily be extended to the more general case 
when an additional degeneracy (such as that due to the different 
orientations of the electron orbits in a complex atom) must be included 
in the perturbation problem. We shall not stop here, however, to 
examine this general case. 
43. The Method of the Self-consistent Field with Factorized Wave Functions 

The reduction of the problem of many electrons to that of a single 
electron in that form in which it has been considered in the two preceding 
sections is based on the description of the unperturbed motion 
of each electron in a given external field, that is, by means of an individual 
wave function of a given form. Now in actual problems, 
connected with the structure of atoms and molecules, such a field cannot 
be defined beforehand in a way which would ensure the degree of 
accuracy of the zero-order approximation which is necessary for the 
successful application of the perturbation theory. We must now turn 
to the consideration of this problem, namely, the problem of the determination 
of the ‘equivalent external field’ for the separate electrons 
forming a more or less complicated system (such, for example, as a 
complex atom). 

A relatively simple method which is quite similar to that used in 
the earlier (Bohr’s) quantum theory of complex atoms, consists in the 
identification of the external field acting on a given electron with that 
of a bare nucleus (or nuclei, if there is more than one) with an electric 
charge differing from the actual one by a certain constant, which, 
divided by the elementary charge, is denoted as the ‘screening constant’ 
and is to be chosen in such a way as to represent with the highest 
possible degree of accuracy the effect of the repulsive forces acting on 
each electron due to all the rest. 

To get a more exact description of this action it is sometimes preferable 
to distribute the electric charge of all the electrons except that 
under consideration in a continuous way over some surface, or in a certain 
volume, with a uniform density or a density varying according to 
some more or less arbitrarily chosen law. 

In all these cases we get a problem containing a finite number of 
constant parameters which must be adjusted in a way leading to the 
least possible error. 

This problem is solved very easily (at least in principle) with the 
help of the variational form of the equations of motion, namely, 

\[ \delta\, \frac{\int \phi^* H \phi\, dV}{\int \phi^* \phi\, dV} = 0, \]

where $\phi(x_1, x_2, \ldots, x_n)$ is determined as the product of n individual functions 
$\psi_1(x_1; a_1, b_1, \ldots), \ldots, \psi_n(x_n; a_n, b_n, \ldots)$ of known form, 
containing a number of undetermined parameters† $a_1, a_2, \ldots$, etc. [cf. 
§ 9, Chap. II]. 

Under these conditions the expression $W = \int \phi^* H \phi\, dV / \int \phi^* \phi\, dV$, 
which is equal to the energy of the system, is defined as a certain 
function of the parameters a, whose values must be determined from 
the equations 

\[ \frac{\partial W}{\partial a_1} = 0, \quad \frac{\partial W}{\partial b_1} = 0, \quad \ldots, \quad \frac{\partial W}{\partial a_2} = 0, \quad \ldots, \quad \frac{\partial W}{\partial a_n} = 0, \quad \ldots \]
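A familiar one-parameter instance of this procedure is the screening constant of a helium-like atom. The sketch below is an illustration under stated assumptions and is not taken from the text: both electrons are placed in hydrogen-like 1s orbits of effective nuclear charge z_eff, for which the standard textbook expression for the energy in atomic units (hartree) is W = z_eff² − 2 Z z_eff + (5/8) z_eff, and the single condition dW/dz_eff = 0 is solved numerically by a golden-section search. The function names are hypothetical.

```python
def helium_like_energy(z_eff, z_nuc=2.0):
    """Energy W(z_eff) in hartree of two electrons in 1s orbits of effective
    charge z_eff around a nucleus of charge z_nuc (assumed textbook formula):
    kinetic z_eff**2, nuclear attraction -2*z_nuc*z_eff, repulsion (5/8)*z_eff."""
    return z_eff ** 2 - 2.0 * z_nuc * z_eff + 0.625 * z_eff


def golden_section_minimum(func, lo, hi, tol=1e-9):
    """Locate the minimum of a unimodal function on the interval [lo, hi]."""
    inv_phi = (5 ** 0.5 - 1) / 2
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    while b - a > tol:
        if func(c) < func(d):
            b = d  # the minimum lies in [a, d]
        else:
            a = c  # the minimum lies in [c, b]
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
    return (a + b) / 2


if __name__ == "__main__":
    z_nuc = 2.0
    z_best = golden_section_minimum(lambda z: helium_like_energy(z, z_nuc), 1.0, 2.0)
    print("effective charge   :", z_best)
    print("screening constant :", z_nuc - z_best)
    print("energy (hartree)   :", helium_like_energy(z_best, z_nuc))
```

Under the assumed formula the minimum can also be found by hand: z_eff = Z − 5/16, i.e. a screening constant of 5/16, and W = −(Z − 5/16)².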


The equation $\delta W = 0$ can be used, however, not only to adjust the 
values of a finite number of parameters introduced in the more or less 
arbitrarily specified functions $\psi_1, \ldots, \psi_n$, but also to determine these functions 
themselves without the explicit introduction of any parameters 
(implicitly they are contained in the definition of the functions $\psi$ if the 
latter are supposed to be expanded in some sort of series). Now the 
factorized form of the wave function $\phi$ describing the behaviour of 
the whole system of electrons corresponds to the possibility of assigning 
to each of them a separate ‘orbit’, i.e. a motion explicitly independent 
of that of the rest (in the sense of the wave-mechanical probability 
interpretation). Inasmuch as the variational principle $\delta W = 0$ ensures 
the highest accuracy of the results consistent with any given assumption 
about the character of the motion, we can thus state that the most 
accurate description of the motion of a system of electrons in terms of 
the quasi-independent motions of the separate electrons is obtained by 
defining the functions $\psi_1(x_1), \ldots, \psi_n(x_n)$, describing these individual 
motions, with the help of the variational equation, with 

\[ \phi(x_1, \ldots, x_n) = \psi_1(x_1) \ldots \psi_n(x_n). \]


The above method has the advantage of avoiding the introduction of 
an arbitrary effective external field for each electron. Such a field is, 
however, introduced implicitly and can easily be determined in an 
explicit form. This is the so-called ‘self-consistent field’ which we have 
already alluded to many times, and which was applied for the first time 
to the problem of complex atoms by Hartree. 

† We shall leave aside for the time being the complications arising from the spin 
effect. 

In his original theory of the self-consistent field Hartree did not 
make any use of the variational principle (which was introduced for 
this purpose later on by V. Fock and J. C. Slater) but was guided by 
the idea that the action experienced by one electron due to the rest 
can be calculated approximately by distributing in space the electric 
charge of the latter with a density proportional to the probability of 
their respective positions. The contribution of each electron to the 
probable density of charge $\rho$ at a given point is obviously given by 
$\rho_k = e|\psi_k(x)|^2$ under the condition that all the individual functions $\psi$ 
are normalized to 1: 

\[ \int |\psi_k(x)|^2\, dx = 1 \]

(where dx is an abbreviation for the element of volume dx dy dz). The 
potential energy $U'_i$ of the ith electron with respect to all the others 
can be determined accordingly by the expression 

\[ U'_i = \sum_{k \neq i} U_{ik}, \]

where 

\[ U_{ik} = e^2 \int \frac{|\psi_k(x_k)|^2}{r_{ik}}\, dx_k, \]

or, with a slightly different notation, 

\[ U'_i(r) = e^2 \sum_{k \neq i} \int \frac{|\psi_k(r')|^2}{|r - r'|}\, dV'. \tag{360} \]

Adding to this expression the potential energy $U_0(r)$ of the external 
forces (which must obviously have the same form for all the electrons) 
and substituting the resulting effective energy 

\[ U_i(r) = U_0(r) + U'_i(r) \tag{360 a} \]

in the Schrödinger equation 

\[ \Bigl[ -\frac{h^2}{8\pi^2 m} \nabla^2 + U_i(r) - W_i \Bigr] \psi_i = 0, \tag{360 b} \]

we can determine the wave function $\psi_i$ describing the motion of the 
electron in question if the functions $\psi_k$ ($k \neq i$) describing that of the 
other electrons are supposed to be known. Now as a matter of fact they 
are not known beforehand, each of them being determined through the 
rest by an equation of the form (360 b). We obtain in this way a system 
of n integro-differential equations which can serve for the simultaneous 
determination of all the n individual wave functions $\psi_1, \ldots, \psi_n$. 

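It may seem at first sight that the total energy W of the whole 
system is equal to the sum of the individual energies $W_i$. This is, however, 
easily seen not to be the case. In fact multiplying equation (360 b) 
on the left by $\psi_i^*$ and integrating, we have, in view of the supposed 
normalization of $\psi_i$, 

\[ W_i = \int \psi_i^* \Bigl[ -\frac{h^2}{8\pi^2 m} \nabla^2 + U_i \Bigr] \psi_i\, dV, \]

or, according to the definition of $U_i$ (with $U_{0i}$ denoting $U_0$ taken at the 
position of the ith electron), 

\[ W_i = \int \psi_i^* \Bigl[ -\frac{h^2}{8\pi^2 m} \nabla_i^2 + U_{0i} \Bigr] \psi_i\, dV_i + \sum_{k \neq i} \int\!\!\int \frac{e^2}{r_{ik}}\, |\psi_i|^2 |\psi_k|^2\, dV_i\, dV_k, \]

whence it follows that 

\[ \sum_{i=1}^n W_i = \sum_{i=1}^n \int \psi_i^* \Bigl[ -\frac{h^2}{8\pi^2 m} \nabla_i^2 + U_{0i} \Bigr] \psi_i\, dV_i + \sum_{i=1}^n \sum_{k \neq i} \int\!\!\int \frac{e^2}{r_{ik}}\, |\psi_i|^2 |\psi_k|^2\, dV_i\, dV_k, \]

whereas the actual value of the total energy, corresponding to our 
approximation, is 

\[ W = \int \phi^* H \phi\, dV = \sum_{i=1}^n \int \psi_i^* \Bigl[ -\frac{h^2}{8\pi^2 m} \nabla_i^2 + U_{0i} \Bigr] \psi_i\, dV_i + \tfrac12 \sum_{i=1}^n \sum_{k \neq i} \int\!\!\int \frac{e^2}{r_{ik}}\, |\psi_i|^2 |\psi_k|^2\, dV_i\, dV_k, \]

the mutual potential energy of all the electrons thus being doubled in 
the expression $\sum W_i$. 

In order to calculate the total energy W with the help of the ‘partial 
energies’ $W_i$ we must introduce in addition the ‘proper energies’ of the 
separate electrons 

\[ E_i = \int \psi_i^* \Bigl[ -\frac{h^2}{8\pi^2 m} \nabla_i^2 + U_{0i} \Bigr] \psi_i\, dV_i. \]

Denoting their sum $\sum E_i$ by E, we get 

\[ \sum_{i=1}^n W_i = 2W - E, \]

whence 

\[ W = \tfrac12 \Bigl( E + \sum_{i=1}^n W_i \Bigr) = \tfrac12 \sum_{i=1}^n (E_i + W_i). \tag{360 c} \]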
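The bookkeeping behind the relation W = ½(E + ΣW_i) can be followed on a small numerical example. The sketch below is illustrative only: it uses two electrons in identical hydrogen-like 1s orbits of effective charge z_eff around a nucleus of charge z_nuc, with the standard textbook expectation values in atomic units (proper energy z_eff²/2 − z_nuc·z_eff per electron, mutual repulsion (5/8) z_eff), and takes the partial energy of each electron as the expectation value of the one-electron operator including the averaged repulsion. These trial orbits are not solutions of the self-consistent equations; the point is only the double counting. All names are hypothetical.

```python
def proper_partial_mutual(z_eff, z_nuc=2.0):
    """Return (E_i, W_i, V) in hartree for one electron of a helium-like atom
    with both electrons in 1s orbits of effective charge z_eff (assumed formulas)."""
    proper = 0.5 * z_eff ** 2 - z_nuc * z_eff  # kinetic energy + attraction to the nucleus
    mutual = 0.625 * z_eff                     # average repulsion of the two electrons
    partial = proper + mutual                  # W_i: proper energy + repulsion by the other
    return proper, partial, mutual


def total_energy(proper_energies, partial_energies):
    """Total energy from the proper and partial energies: W = (E + sum W_i) / 2."""
    return 0.5 * (sum(proper_energies) + sum(partial_energies))


if __name__ == "__main__":
    e_i, w_i, v = proper_partial_mutual(27 / 16)
    w = total_energy([e_i, e_i], [w_i, w_i])
    print("sum of partial energies:", 2 * w_i)      # counts the repulsion twice
    print("total energy           :", w)            # counts it once
    print("difference             :", 2 * w_i - w)  # equals the mutual energy v
```

The sum of the partial energies exceeds the total energy by exactly the mutual potential energy, which is the doubling referred to in the text.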

' ' i -1 


It should be mentioned that Hartree’s self-consistent field can be 
defined either by the resulting probable density of the electric charge 
$\rho = e \sum_{i=1}^n |\psi_i|^2$, from which the electric potential with due allowance for 
the contribution of the external field can be derived by means of 
Poisson’s equation, or by the electric density $\rho'_i = \rho - \rho_i = \rho - e|\psi_i|^2$ 
and the potential energy (360) which corresponds to an electric field 
of specific form for each of the electrons. 

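We shall now come back to the variational equation in the form 

\[ \delta \int \phi^* H \phi\, dV = 0, \tag{361} \]

with $\phi$ defined as the product $\psi_1(x_1) \ldots \psi_n(x_n)$, and the n additional 
normalizing conditions $\int \psi_i^* \psi_i\, dx = 1$ or 

\[ \delta \int \psi_i^* \psi_i\, dx = 0. \tag{361 a} \]

We have 

\[ \delta \int \phi^* H \phi\, dV = \int \delta\phi^*\, H \phi\, dV + \int \phi^* H\, \delta\phi\, dV. \]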
Now in virtue of the self-adjoint character of the operator H (which we 
shall suppose to involve real quantities only) we have further 

\[ \int \phi^* H\, \delta\phi\, dV = \int \delta\phi\, H \phi^*\, dV, \]

so that (361) can be written in the form 

\[ \int \delta\phi^*\, H \phi\, dV + \int \delta\phi\, H \phi^*\, dV = 0. \]

Substituting here the product $\psi_1(x_1) \ldots \psi_n(x_n)$ for $\phi$ we get 

\[ \sum_{i=1}^n \int \delta\psi_i^* \prod_{k \neq i} \psi_k^*\, H \phi\, dV + \sum_{i=1}^n \int \delta\psi_i \prod_{k \neq i} \psi_k\, H \phi^*\, dV = 0. \]

If we subtract from this equation the n equations equivalent to (361 a) 

\[ \int \delta\psi_i^*\, \psi_i \prod_{k \neq i} \psi_k^* \psi_k\, dV + \int \delta\psi_i\, \psi_i^* \prod_{k \neq i} \psi_k^* \psi_k\, dV = 0 \]

multiplied by suitably chosen parameters, $\lambda_i$ say, we can equate to zero 
the coefficients of all the variations $\delta\psi_i^*$ and $\delta\psi_i$ (Lagrange’s method of 
undetermined multipliers). This gives 

\[ (H_i - \lambda_i)\, \psi_i = 0, \tag{362} \]

where 

\[ H_i = \int \prod_{k \neq i} \psi_k^*\, H \prod_{k \neq i} \psi_k \prod_{k \neq i} dx_k \tag{362 a} \]

is an operator which can be defined as the average value of the actual 
energy operator H for a given position of the ith electron and for all 
the configurations of the other ones. Similar equations are obtained by 
equating to zero the coefficients of the variations $\delta\psi_i$ with $H_i$ replaced by 
$H_i^* = \int \prod_{k \neq i} \psi_k\, H \prod_{k \neq i} \psi_k^* \prod_{k \neq i} dx_k$. They need not be considered separately 
for they are actually equivalent to the equations (362). The latter 
provide the mathematical justification for the physical principle which 
was used by Hartree† and are practically equivalent to Hartree’s equation 
(360 b) if H is determined, as usual, by the formula 

\[ H = \sum_{i=1}^n E(x_i, p_i) + \tfrac12 \sum_{i=1}^n \sum_{k \neq i} F(x_i, x_k) \tag{362 b} \]

with 

\[ F(x_i, x_k) = \frac{e^2}{r_{ik}}. \]
The only difference between them consists, as is easily seen, in the fact 
that $H_i$ involves in addition to the proper energy of the ith electron 
$-\frac{h^2}{8\pi^2 m} \nabla_i^2 + U_{0i}$ and its average potential energy with respect to the 
rest, the average of the energies of all the other electrons. Hence the 
constants $\lambda_i$ appearing in (362) are easily seen to have the same value, 
namely, W, the total energy of the system. It should be mentioned 
that the normal state of the latter corresponds to the condition that 
W should have the least possible value of all the ‘stationary’ values 
which are allowed by the variational equation (361), in conjunction 
with (361 a). 

† It may be remembered that essentially the same principle had been used before 
by Schrödinger in connexion with his attempt to re-establish the wave theory of light 
emission on the basis of wave mechanics. (See Part I, § 17.) 

The preceding theory applies not only to a system of electrons but 
just as well to a system consisting of different particles or indeed 
of systems of any sort if $x_i$ denotes the totality of the coordinates 
specifying the state of the corresponding elementary system and if the 
total energy (362 b) is written in the somewhat more general form 

\[ H = \sum_i E_i(x_i, p_i) + \sum_{i<k} F_{ik}(x_i, x_k). \tag{362 c} \]

44. The Method of the Self-consistent Field with Antisymmetrical Functions and Dirac’s Density Matrix 

In the particular case of a system of electrons the accuracy of Hartree’s 
method is limited not only intrinsically but also by the fact that a 
specific distribution of electrons among the n orbits $\psi_1, \ldots, \psi_n$ such as 
that defined by the function $\phi$ violates the identity principle. The 
function $\phi$ defined by the product $\psi_1(x_1) \ldots \psi_n(x_n)$ must serve merely as 
a starting-point for the perturbation theory which has been considered 
in § 37 in connexion with the exchange degeneracy. 

Instead of accounting for the latter a posteriori we can take it into 
account from the beginning if we replace the factorized function $\phi$ in 
the variational equation by a linear combination of such functions, 
corresponding to the different permutations P of the electrons between 
the individual states $\psi_1, \ldots, \psi_n$: 

\[ \chi = \sum_P C_P\, P\phi. \]

The functions $\psi_i$ obtained in this way will of course be somewhat 
different from those which are defined by the equations (361c) and 
which do not involve the exchange effect. As to the coefficients $C_P$, 
they can be shown to be the same as in the case of the perturbation 
problem corresponding to functions $\psi_1, \ldots, \psi_n$ known a priori. 

We shall determine the latter for the antisymmetrical functions with 
spin which have been dealt with in the preceding section. We put 
accordingly 

\[ \chi(x, \xi) = C \begin{vmatrix} \psi_1(x_1, \xi_1) & \cdots & \psi_1(x_n, \xi_n) \\ \vdots & & \vdots \\ \psi_n(x_1, \xi_1) & \cdots & \psi_n(x_n, \xi_n) \end{vmatrix} = C \sum_P \epsilon_P\, P\phi(x, \xi), \tag{363} \]

where $\psi_i(x, \xi) = \psi_i(x)\, \delta_{m_i}(\xi)$ 
and $\phi(x, \xi) = \psi_1(x_1, \xi_1) \ldots \psi_n(x_n, \xi_n)$. 

We shall further assume for the sake of simplicity all the individual 
wave functions with spin not only to be normalized but also to be 
mutually orthogonal in the sense of the equations 

\[ \sum_\xi \int \psi_i^*(x, \xi)\, \psi_k(x, \xi)\, dx = \delta_{ik}. \tag{363 a} \]

It should be mentioned that if this orthogonality condition were not 
fulfilled for the original wave functions $\psi$ we could replace them by 
certain linear combinations satisfying these conditions. The a priori 
introduction of the latter does not therefore impair the generality of 
the theory. It serves, however, materially to simplify its external form. 
The normalizing condition for the function (363) under the assumption 
(363 a) gives $C = 1/\sqrt{n!}$. 
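The value $C = 1/\sqrt{n!}$ can be illustrated on a toy model in which each electron coordinate takes only a few discrete values, so that integrals become finite sums. The following Python sketch is an illustration under that assumption (the discrete model, the orbitals, and the function names are not from the text): it builds the antisymmetrized combination $\sum_P \epsilon_P P\phi$ from three orthonormal real 'orbitals' on four points, divides by $\sqrt{n!}$, and checks the normalization and the antisymmetry.

```python
from itertools import permutations, product
from math import factorial, sqrt


def parity(perm):
    """+1 for an even permutation, -1 for an odd one (counted by inversions)."""
    n = len(perm)
    inversions = sum(1 for a in range(n) for b in range(a + 1, n) if perm[a] > perm[b])
    return -1 if inversions % 2 else 1


def chi(orbitals, coords):
    """Antisymmetrical function (1/sqrt(n!)) * sum_P eps_P * prod_i psi_{P(i)}(x_i)."""
    n = len(orbitals)
    total = 0.0
    for perm in permutations(range(n)):
        term = parity(perm)
        for electron, orbit in enumerate(perm):
            term *= orbitals[orbit][coords[electron]]
        total += term
    return total / sqrt(factorial(n))


def norm_squared(orbitals, n_points):
    """Sum of chi**2 over all configurations of the n discrete coordinates."""
    n = len(orbitals)
    return sum(chi(orbitals, c) ** 2 for c in product(range(n_points), repeat=n))


# Three mutually orthogonal, normalized 'orbitals' on four points (toy data).
ORBITALS = [
    [0.5, 0.5, 0.5, 0.5],
    [0.5, 0.5, -0.5, -0.5],
    [0.5, -0.5, 0.5, -0.5],
]

if __name__ == "__main__":
    print("norm             :", norm_squared(ORBITALS, 4))
    print("chi(0, 1, 2)     :", chi(ORBITALS, (0, 1, 2)))
    print("chi(1, 0, 2)     :", chi(ORBITALS, (1, 0, 2)))  # opposite sign
    print("chi(0, 0, 2)     :", chi(ORBITALS, (0, 0, 2)))  # vanishes
```

With orbitals that are not mutually orthogonal the same construction would no longer give unit norm, which is why the orthogonality assumption (363 a) simplifies the normalization.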

It will be convenient in what follows to write $x_i$ for $x_i, \xi_i$ and $\int$ for 
$\sum_\xi \int$, thus keeping externally the notation corresponding to spinless 
functions. We can formally proceed in the same way as if we were 
dealing with an antisymmetrical function (363) without spin.† Substituting 
it instead of $\phi$ in the variational equation (361) (which in our 
case should be written in the form $\delta \sum_\xi \int \chi^* H \chi\, dV = 0$) and taking account 
of the self-adjointness of the operator H, we get as before 

\[ \int \delta\chi^*\, H \chi\, dV + \int \delta\chi\, H \chi^*\, dV = 0 \tag{364} \]

(the summation over the $\xi$’s being understood). 

Now we have according to (363) 

\[ \int \delta\chi^*\, H \chi\, dV = \frac{1}{\sqrt{n!}} \sum_P \epsilon_P \int P\, \delta\phi^*\, H \chi\, dV, \]

and further, since the integral $\int P\, \delta\phi^*\, H \chi\, dV$ (or more exactly 
$\sum_\xi \int P\, \delta\phi^*\, H \chi\, dV$) does not change if any permutation, $P^{-1}$ in particular, 
is applied to all the integration variables, 

\[ \int \delta\chi^*\, H \chi\, dV = \frac{1}{\sqrt{n!}} \sum_P \epsilon_P \int \delta\phi^*\, H\, P^{-1}\chi\, dV; \]

or finally, since $P^{-1}\chi = \epsilon_P\, \chi$, 

\[ \int \delta\chi^*\, H \chi\, dV = \frac{1}{\sqrt{n!}} \sum_P \int \delta\phi^*\, H \chi\, dV = \sqrt{n!} \int \delta\phi^*\, H \chi\, dV, \tag{364 a} \]

and in the same way 

\[ \int \delta\chi\, H \chi^*\, dV = \sqrt{n!} \int \delta\phi\, H \chi^*\, dV. \]

† The variations $\delta\phi$ must of course refer to the factor $\psi_i(x)$ only, leaving the spin 
factor $\delta_{m_i}(\xi)$ unaltered. 

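If we now substitute for $H$ the operator (362 b) (by definition not
involving the spin) and replace $\sqrt{n!}\,\chi$ by the expression $\sum_P \epsilon_P P\phi$ we get

$$\int \delta\chi^*\, H\chi\, dV = \sum_P \epsilon_P \int \delta\phi^*\, H P\phi\, dV = \sum_P \epsilon_P \Big\{ \sum_i \int \delta\phi^*\, E(x_i, p_i)\, P\phi\, dV + \sum_{i<k} \int \delta\phi^*\, F(x_i, x_k)\, P\phi\, dV \Big\}.$$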

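The integral $\int \delta\phi^*\, E(x_i, p_i)\, P\phi\, dV$, where $\delta\phi^* = \sum_{i=1}^{n} \delta\psi_i^* \prod_{k \neq i} \psi_k^*$, is easily
seen to be different from zero only if $P$ denotes the identical permutation
(because of the orthogonality conditions $\int \psi_i^* \psi_k\, dx = \delta_{ik}$) when it
reduces to $\int \delta\psi_i^*\, E(x_i, p_i)\, \psi_i\, dx_i$. We have further, if $P$ is the identical
permutation,

$$\int \delta\phi^*\, F(x_i, x_k)\, P\phi\, dV = \iint \big[\psi_k^*(x_k)\, \delta\psi_i^*(x_i) + \psi_i^*(x_i)\, \delta\psi_k^*(x_k)\big]\, F(x_i, x_k)\, \psi_i(x_i)\, \psi_k(x_k)\, dx_i\, dx_k$$

and

$$\int \delta\phi^*\, F(x_i, x_k)\, P\phi\, dV = \iint \big[\psi_k^*(x_k)\, \delta\psi_i^*(x_i) + \psi_i^*(x_i)\, \delta\psi_k^*(x_k)\big]\, F(x_i, x_k)\, \psi_i(x_k)\, \psi_k(x_i)\, dx_i\, dx_k$$

if $P$ is equal to the transposition $T_{ik}$, i.e. the interchange between the
$i$th and $k$th electrons, and zero in all other cases. We thus get, on
account of the symmetry relation $F(x_i, x_k) = F(x_k, x_i)$,

$$\int \delta\chi^*\, H\chi\, dV = \sum_i \int dx_i\, \delta\psi_i^*(x_i) \Big\{ \Big[ E(x_i, p_i) + \sum_k \int dx_k\, F(x_i, x_k)\, |\psi_k(x_k)|^2 \Big] \psi_i(x_i) - \sum_k \Big[ \int dx_k\, F(x_i, x_k)\, \psi_k^*(x_k)\, \psi_i(x_k) \Big] \psi_k(x_i) \Big\}.$$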

Putting for the sake of brevity

$$A_{ki}(x) = \int F(x, x')\, \psi_k^*(x')\, \psi_i(x')\, dx' \tag{365}$$

and

$$B(x) = \sum_{k=1}^{n} A_{kk}(x), \tag{365 a}$$

we can rewrite the preceding expression as follows:

$$\int \delta\chi^*\, H\chi\, dV = \sum_i \int dx\, \delta\psi_i^*(x) \Big\{ [E(x, p_x) + B(x)]\, \psi_i(x) - \sum_k A_{ki}(x)\, \psi_k(x) \Big\} = 0. \tag{365 b}$$

Subtracting from this equation and the conjugate complex equation
$\int \delta\chi\, H\chi^*\, dV = 0$ the expressions

$$\lambda_{ki} \int \delta\psi_i^*(x)\, \psi_k(x)\, dx + \lambda_{ki} \int \psi_i^*(x)\, \delta\psi_k(x)\, dx = 0,$$

which are derived from the orthogonality and normalizing conditions
$\int \psi_i^* \psi_k\, dx = \delta_{ik}$, and equating to zero the coefficients of the variations
$\delta\psi_i^*$, we obtain the following system of equations for the functions $\psi_i(x)$:

$$(E + B)\, \psi_i(x) - \sum_{k=1}^{n} (A_{ki} + \lambda_{ki})\, \psi_k(x) = 0, \tag{366}$$

and a similar system for the conjugate complex functions. If we
multiply these equations on the left by $\psi_j^*(x)$ and integrate over $x$
(including summation over $\xi$) we get, in virtue of the orthogonality
and normalizing relations,

$$\lambda_{ji} = \int \psi_j^*(x)\, (E + B)\, \psi_i(x)\, dx - \sum_{k=1}^{n} \int A_{ki}\, \psi_j^*(x)\, \psi_k(x)\, dx$$

or, according to (365) and (365 a),

$$\lambda_{ji} = E_{ji} + \iint F(x, x')\, \psi_j^*(x)\, \psi_i(x) \sum_{k=1}^{n} \psi_k^*(x')\, \psi_k(x')\, dx\, dx' - \iint F(x, x')\, \psi_j^*(x)\, \psi_i(x') \sum_{k=1}^{n} \psi_k^*(x')\, \psi_k(x)\, dx\, dx', \tag{366 a}$$

where

$$E_{ji} = \int \psi_j^*(x)\, E(x, p_x)\, \psi_i(x)\, dx \tag{366 b}$$

are the matrix elements of the proper energy of an electron [including
its external potential energy $U_0(x)$] with respect to the states $i$ and $j$.

Although the coefficients $\lambda_{ki}$ are completely determined by these
equations, they can actually be considered as arbitrary constants forming
an Hermitian matrix, i.e. satisfying the relations $\lambda_{ik}^* = \lambda_{ki}$, and
further subject to the condition that the diagonal sum $\sum_i \lambda_{ii}$ should
have a given constant value.

This conclusion follows from the fact that the set of normalized and
orthogonal wave functions $\psi_i$ can be replaced by any set of linear
combinations of these functions, provided the transformed functions

$$\psi'_{i'} = \sum_i c_{i'i}\, \psi_i$$

also satisfy the normalizing and orthogonality conditions. In fact the
functions $A_{ik}$ are transformed by (365) according to the equations

$$A'_{i'k'} = \sum_i \sum_k c^*_{i'i}\, c_{k'k}\, A_{ik},$$

i.e. like the components of a tensor in the $n$-dimensional space, whose
coordinates are defined by the values of the $n$ functions $\psi_i(x)$. The
latter can also be considered as the components of a vector $\boldsymbol{\psi}(x)$ referred
to a certain set of orthogonal coordinate axes, the $\psi'_{i'}(x)$ being its components
with respect to another system of such axes (with the same origin). In
other words, the equations (366) can be considered as invariant with
regard to all the orthogonal transformations or ‘rotations’ of the coordinate
axes, if the coefficients $\lambda_{ik}$ are likewise defined as the components
of an arbitrary tensor $\lambda$, the operators $E$ and $B$ being obviously
scalars.

As has been pointed out by Dirac, the arbitrariness involved in the
determination of the components $\psi_i(x)$ of a vector $\boldsymbol{\psi}(x)$ can be removed
if instead of such a vector we consider its scalar product with the
conjugate complex of a vector $\boldsymbol{\psi}(x')$ associated with some other
point $x'$. This product, which will be denoted as

$$\rho(x, x') = \sum_{i=1}^{n} \psi_i(x)\, \psi_i^*(x'), \tag{367}$$

is invariant under the above transformation and is therefore the only
quantity that can be determined unambiguously in connexion with our
problem. It can, moreover, easily be shown to be the only quantity
we actually need know, the energy

$$W = \int \chi^*\, H\chi\, dV$$

of the system of electrons being expressible as a function of $\rho$.
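The invariance of the product (367) under a 'rotation' of the orbitals can be illustrated numerically. The following sketch is an assumption-laden toy example, not the author's calculation: two real orthonormal orbitals on a hypothetical four-point grid are mixed by an orthogonal 2×2 rotation (the helper names `density_matrix` and `rotate` and the angle 0.7 are chosen here for illustration), and the resulting density matrices are compared.

```python
import math

def density_matrix(orbitals):
    """rho[x][y] = sum_i psi_i(x) * psi_i(y) for real orbitals on a grid."""
    m = len(orbitals[0])
    return [[sum(psi[x] * psi[y] for psi in orbitals) for y in range(m)]
            for x in range(m)]

def rotate(orbitals, theta):
    """Mix the first two orbitals by an orthogonal rotation through theta."""
    c, s = math.cos(theta), math.sin(theta)
    a, b = orbitals[0], orbitals[1]
    new_a = [c * u + s * v for u, v in zip(a, b)]
    new_b = [-s * u + c * v for u, v in zip(a, b)]
    return [new_a, new_b] + orbitals[2:]

h = 0.5
orbitals = [[h, h, h, h], [h, h, -h, -h]]   # hypothetical orthonormal pair
rotated = rotate(orbitals, 0.7)             # arbitrary rotation angle

rho = density_matrix(orbitals)
rho_rot = density_matrix(rotated)
max_diff = max(abs(rho[x][y] - rho_rot[x][y])
               for x in range(4) for y in range(4))
print(max_diff)
```

The individual orbitals change under the rotation, while the density matrix built from them should agree with the original one up to rounding.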

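In fact the preceding formula is reduced (in the same way as the
expression $\int \delta\chi^*\, H\chi\, dV$) to the form

$$W = \sqrt{n!} \int \phi^*\, H\chi\, dV = \sum_P \epsilon_P \int \phi^*\, H P\phi\, dV,$$

or, if the energy operator is defined by (362 b),

$$W = \int \phi^*\, H\phi\, dV - \sum_{i<k} \int \phi^*\, H\, T_{ik}\, \phi\, dV.$$

Hence we get in the same way as in the derivation of (365 b)

$$W = \sum_{i=1}^{n} E_{ii} + \tfrac{1}{2} \iint F(x, x')\, \big[\rho(x, x)\, \rho(x', x') - |\rho(x, x')|^2\big]\, dx\, dx', \tag{367 a}$$

where $|\rho(x, x')|^2 = \rho(x, x')\, \rho(x', x)$.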

It should be mentioned that the sum $\sum_i \lambda_{ii}$ differs from this expression
by the absence of the factor $\tfrac{1}{2}$ in the second term which corresponds
to the mutual energy of the different electrons, so that we can put

$$W = \tfrac{1}{2} \sum_{i=1}^{n} (E_{ii} + \lambda_{ii}).$$

The quantities $\lambda_{ii}$ are thus easily seen to correspond to the partial
energies of our previous theory.

The integral $\tfrac{1}{2} \iint F(x, x')\, \rho(x, x)\, \rho(x', x')\, dx\, dx'$ with $F(x, x') = e^2 / r(x, x')$
represents the mutual potential energy which is obtained if the charges
of the electrons are distributed in space with a volume density $e|\psi_i(x)|^2$,
$e\rho(x, x) = e \sum_{i=1}^{n} |\psi_i(x)|^2$ being the resulting density of the ‘electron cloud’.
This includes the action of an electron spread out into a cloud upon
itself, which is devoid of physical meaning. Such self-action is, however,
cancelled out by the second integral on the right side of (367 a),

$$-\tfrac{1}{2} \iint F(x, x')\, |\rho(x, x')|^2\, dx\, dx',$$

which also represents the exchange effect or, as it is usually denoted,
the ‘exchange energy’ of the electrons.†
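Both statements, the cancellation of the self-action and the relation $W = \tfrac{1}{2}\sum_i (E_{ii} + \lambda_{ii})$, can be checked on a small discrete model. The sketch below is a hedged illustration: the grid, the one-electron matrix `E` (a nearest-neighbour hopping matrix) and the softened interaction `F` are assumptions made here, with integrals replaced by sums over grid points and real orbitals used throughout.

```python
def density(orbitals):
    """rho[x][y] = sum_i psi_i(x) psi_i(y) for real orbitals."""
    m = len(orbitals[0])
    return [[sum(p[x] * p[y] for p in orbitals) for y in range(m)]
            for x in range(m)]

def energy_terms(orbitals, E, F):
    """One-body, direct and exchange parts of (367 a), discretized."""
    m = len(E)
    rho = density(orbitals)
    one_body = sum(E[x][y] * rho[y][x] for x in range(m) for y in range(m))
    direct = 0.5 * sum(F[x][y] * rho[x][x] * rho[y][y]
                       for x in range(m) for y in range(m))
    exchange = 0.5 * sum(F[x][y] * rho[x][y] ** 2
                         for x in range(m) for y in range(m))
    return one_body, direct, exchange

def lambda_diag(i, orbitals, E, F):
    """Return (E_ii, lambda_ii) from (366 a) with j = i, discretized."""
    m = len(E)
    rho = density(orbitals)
    p = orbitals[i]
    e_ii = sum(p[x] * E[x][y] * p[y] for x in range(m) for y in range(m))
    coulomb = sum(F[x][y] * p[x] * p[x] * rho[y][y]
                  for x in range(m) for y in range(m))
    exch = sum(F[x][y] * p[x] * p[y] * rho[y][x]
               for x in range(m) for y in range(m))
    return e_ii, e_ii + coulomb - exch

m = 4
# Hypothetical model matrices (assumptions for illustration only).
E = [[-1.0 if abs(x - y) == 1 else 0.0 for y in range(m)] for x in range(m)]
F = [[1.0 / (1.0 + abs(x - y)) for y in range(m)] for x in range(m)]
h = 0.5
orbitals = [[h, h, h, h], [h, h, -h, -h]]

one_body, direct, exchange = energy_terms(orbitals, E, F)
W = one_body + direct - exchange
pairs = [lambda_diag(i, orbitals, E, F) for i in range(len(orbitals))]
W_half_sum = 0.5 * sum(e + lam for e, lam in pairs)

# A single electron: direct and exchange terms should cancel exactly.
_, d1, x1 = energy_terms(orbitals[:1], E, F)
print(W, W_half_sum, d1 - x1)
```

For the single orbital the difference `d1 - x1` should vanish, which is the discrete counterpart of the statement that the exchange integral removes the action of an electron upon itself.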

The first term in (367 a) does not seem at first sight to be consistent
with the representation of the energy as a function of the ‘density
matrix’ $\rho$. If, however, we introduce the elements of the electron’s own
energy matrix $E$ from the point of view of the coordinates $x$

$$E(x, x') = \int \delta(x - x'')\, E(x'', p_{x''})\, \delta(x'' - x')\, dx''$$

(cf. § 17), we can put, since $E_{ii} = \int \psi_i^*(x)\, E\, \psi_i(x)\, dx$,

$$\sum_i E_{ii} = \iint E(x, x')\, \rho(x', x)\, dx\, dx'. \tag{367 b}$$

The fact that the energy $W = \int \chi^*\, H\chi\, dV$ is expressed as a function
(or rather a ‘functional’) of the density matrix $\rho$ alone, shows that the
latter can be determined directly without the functions $\psi_1(x), \ldots, \psi_n(x)$
which have initially served for its definition. Multiplying the equations
(366) by $\psi_i^*(x')$, subtracting therefrom the product by $\psi_i(x)$ of the
corresponding equations for the conjugate complex of $\psi_i(x')$, and summing
over $i$, taking into account the relations $A_{ik}^* = A_{ki}$ and $\lambda_{ik}^* = \lambda_{ki}$, we
can eliminate the coefficients $\lambda_{ik}$ with the result

$$[E(x, p_x) + B(x) - E(x', p_{x'}) - B(x')] \sum_i \psi_i(x)\, \psi_i^*(x') - \sum_i \sum_k [A_{ki}(x) - A_{ki}(x')]\, \psi_k(x)\, \psi_i^*(x') = 0. \tag{368}$$

† As has been stated at the beginning, the integration sign in the preceding equations
actually means both integration with regard to the geometrical coordinates and a
summation over the spin coordinates. The latter can easily be introduced explicitly in the
final results. They are, however, wholly irrelevant so long as we are dealing with a
spinless energy. Their only effect is to allow the introduction of doubly occupied spinless
states $\psi_i(x)$ (with opposite spin) without the violation of Pauli’s exclusion principle.
As a result we get a number of relations of the form $A_{ii}(x) = A_{kk}(x)$ and
$A_{ik}(x) = A_{ki}(x)$ for indices $i$ and $k$ which correspond to identical spinless states.

If we substitute here the expression (365) for $A_{ki}(x)$ with $x'$ replaced
by $x''$ and similarly put $A_{ki}(x') = \int F(x', x'')\, \psi_k^*(x'')\, \psi_i(x'')\, dx''$, we obtain
the following equation, containing the density matrix alone,

$$(E_x + B_x - E_{x'} - B_{x'})\, \rho(x, x') - \int [F(x, x'') - F(x', x'')]\, \rho(x, x'')\, \rho(x'', x')\, dx'' = 0, \tag{368 a}$$

where $E_x$, etc., is an abbreviation for $E(x, p_x)$, etc.

Introducing a matrix $K$ defined from the point of view of $x$ by the
formula

$$K(x, x') = E(x, x') + \delta(x - x')\, B(x') - A(x, x'), \tag{369}$$

where

$$A(x, x') = F(x, x')\, \rho(x, x') = e^2\, \frac{\rho(x, x')}{r(x, x')} \tag{369 a}$$

and, according to (365 a),

$$B(x) = \int F(x, x')\, \rho(x', x')\, dx' = e^2 \int \frac{\rho(x', x')}{r(x, x')}\, dx', \tag{369 b}$$

we can consider the left-hand side of the equation (368 a) as the $(x, x')$
element of the matrix $K\rho - \rho K$ and accordingly rewrite it in the following
matrix form:

$$K\rho - \rho K = 0. \tag{370}$$


It should be mentioned that the matrix $A(x, x')$ subtracts from the
matrix $B(x, x') = \delta(x - x')\, B(x')$ physically irrelevant terms corresponding
to the action of an electron upon itself and at the same time accounts
for the exchange effect.

With the new matrix notation we can rewrite the expression of the
energy as a function of $\rho$ derived above

$$W = \iint dx\, dx'\, \big\{ E(x, x')\, \rho(x', x) + \tfrac{1}{2} F(x, x')\, [\rho(x, x)\, \rho(x', x') - |\rho(x, x')|^2] \big\} \tag{371}$$

in the form

$$W = D[\rho\,(E + \tfrac{1}{2} B - \tfrac{1}{2} A)] = D[(E + \tfrac{1}{2} B - \tfrac{1}{2} A)\,\rho], \tag{371 a}$$

where $D(M)$ is an abbreviation for the so-called diagonal sum (German,
Spur), i.e. sum (or integral) of the diagonal elements of the matrix $M$;
in the present case we have

$$D(M) = \int M(x, x)\, dx.$$

The equation (370) which is satisfied by $\rho$ can be obtained directly,
i.e. without the use of the functions $\psi_i$, from the variational
equation $\delta W = 0$. With the expression (371) for $W$ we get, since
$F(x, x') = F(x', x)$,

$$\begin{aligned}
\delta W &= \iint dx\, dx'\, \big\{ E(x, x')\, \delta\rho(x', x) + F(x, x')\, [\rho(x, x)\, \delta\rho(x', x') - \rho(x, x')\, \delta\rho(x', x)] \big\} \\
&= \iint dx\, dx'\, [E(x, x') + \delta(x - x')\, B(x') - A(x, x')]\, \delta\rho(x', x) \\
&= \iint dx'\, dx\, \delta\rho(x, x')\, [E(x', x) + \delta(x' - x)\, B(x) - A(x', x)],
\end{aligned}$$

that is, according to (369),

$$\delta W = D(\delta\rho\, K) = D(K\, \delta\rho) = 0. \tag{371 b}$$

It must not be concluded from this equation that $K = 0$, for the
matrix $\rho$ satisfies a certain accessory condition which is obtained by
comparing it with its square.

We have in fact, from the definition of matrix multiplication,

$$\begin{aligned}
\rho^2(x, x') &= \int \rho(x, x'')\, \rho(x'', x')\, dx'' = \sum_i \sum_k \int \psi_i(x)\, \psi_i^*(x'')\, \psi_k(x'')\, \psi_k^*(x')\, dx'' \\
&= \sum_i \sum_k \psi_i(x)\, \delta_{ik}\, \psi_k^*(x') = \sum_i \psi_i(x)\, \psi_i^*(x') = \rho(x, x')
\end{aligned}$$

(because of the orthogonality and normalization of the functions $\psi_i$), i.e.

$$\rho^2 = \rho. \tag{372}$$

It follows that $\delta\rho = \rho\, \delta\rho + \delta\rho\, \rho$, that is,

$$\delta\rho(x', x'') = \int \rho(x', x''')\, \delta\rho(x''', x'')\, dx''' + \int \delta\rho(x', x''')\, \rho(x''', x'')\, dx''',$$

which in conjunction with (371 b) leads to (370).

The relation (372) shows that the characteristic values $\rho'$ of the matrix $\rho$
are equal either to 0 or to 1 (since they satisfy the same equation $\rho'^2 = \rho'$).
We thus obtain, according to Dirac, a new formulation of Pauli’s
exclusion principle, for although the matrix $\rho$ can be introduced
irrespective of the statistical properties of the particles under consideration,
yet it can be shown to possess a dynamical meaning (in the
sense of describing the motion of a system of particles) for the Pauli-Fermi
statistics only (see below, p. 403), so that Pauli’s principle is
expressed implicitly by the property (372) of $\rho$.
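The property $\rho^2 = \rho$ and the characteristic values 0 and 1 are easy to exhibit on a finite model. The sketch below assumes two occupied real orthonormal orbitals on a hypothetical four-point grid and one further orthonormal vector playing the part of an unoccupied state; the helper names are introduced here for illustration only.

```python
def density(orbitals):
    """rho[x][y] = sum_i psi_i(x) psi_i(y) for real orbitals."""
    m = len(orbitals[0])
    return [[sum(p[x] * p[y] for p in orbitals) for y in range(m)]
            for x in range(m)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def apply(matrix, vector):
    return [sum(row[y] * vector[y] for y in range(len(vector)))
            for row in matrix]

h = 0.5
occupied = [[h, h, h, h], [h, h, -h, -h]]   # hypothetical occupied orbitals
empty = [h, -h, h, -h]                      # orthogonal to both of them

rho = density(occupied)
rho_sq = matmul(rho, rho)
idempotency_error = max(abs(rho_sq[x][y] - rho[x][y])
                        for x in range(4) for y in range(4))
trace = sum(rho[x][x] for x in range(4))    # diagonal sum = number of electrons

image_occupied = apply(rho, occupied[0])    # characteristic value 1
image_empty = apply(rho, empty)             # characteristic value 0
print(idempotency_error, trace, image_occupied, image_empty)
```

An occupied orbital is reproduced by $\rho$ and an unoccupied one is annihilated, while the diagonal sum gives the number of electrons, here 2.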

If in the equation (367) we sum over all values of $i$ specifying a
complete set of individual wave functions $\psi_i$ (which corresponds to
$n = \infty$), so long as all these wave functions are normalized and
orthogonal to each other, we obtain

$$\rho(x, x') = \delta(x - x'). \tag{372 a}$$

This expression can be used as an approximation to $\rho$ for large values
of $n$. It is easily seen to satisfy the relation (372).
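On a finite grid the analogue of (372 a) is that a complete orthonormal set gives the unit matrix. A minimal sketch, assuming a hypothetical four-point grid on which four orthonormal vectors already form a complete set:

```python
h = 0.5
# Four mutually orthogonal, normalized vectors: a complete set on 4 points.
complete_set = [
    [h,  h,  h,  h],
    [h,  h, -h, -h],
    [h, -h,  h, -h],
    [h, -h, -h,  h],
]
m = 4
rho = [[sum(p[x] * p[y] for p in complete_set) for y in range(m)]
       for x in range(m)]
identity = [[1.0 if x == y else 0.0 for y in range(m)] for x in range(m)]
deviation = max(abs(rho[x][y] - identity[x][y])
                for x in range(m) for y in range(m))
print(deviation)
```

The unit matrix is the discrete counterpart of $\delta(x - x')$, and it evidently satisfies $\rho^2 = \rho$.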

The preceding results can be generalized for a non-stationary motion
of the electrons, determined, in the method of the configuration space,
by the equation

$$\Big( H + \frac{h}{2\pi i} \frac{\partial}{\partial t} \Big) \chi = 0. \tag{373}$$

In order to obtain the corresponding generalized form of the equations
(366) or of the equation (370) for the density matrix, we need only
remember that the equation (373) is equivalent to the variational
equation

$$\int \delta\chi^* \Big( H + \frac{h}{2\pi i} \frac{\partial}{\partial t} \Big) \chi\, dV = 0 \tag{373 a}$$

[cf. § 26, eq. (207 a)]. Now

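$$\int \delta\chi^*\, \frac{\partial \chi}{\partial t}\, dV = \sum_P \epsilon_P \int \delta\phi^*\, P \frac{\partial \phi}{\partial t}\, dV = \int \delta\phi^*\, \frac{\partial \phi}{\partial t}\, dV = \sum_i \int \delta\psi_i^* \Big( \frac{\partial \psi_i}{\partial t} + \psi_i \sum_{k \neq i} \int \psi_k^*\, \frac{\partial \psi_k}{\partial t}\, dx \Big)\, dx.$$

Putting for the sake of brevity

$$\int \psi_i^*\, \frac{\partial \psi_i}{\partial t}\, dx = a_i, \qquad \sum_i a_i = b, \qquad b - a_i = b_i,$$

we thus get

$$-\frac{h}{2\pi i} \int \delta\chi^*\, \frac{\partial \chi}{\partial t}\, dV = -\frac{h}{2\pi i} \sum_i \int \delta\psi_i^* \Big( \frac{\partial}{\partial t} + b_i \Big) \psi_i\, dx.$$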

Equating this expression to the expression (365 b) for $\int \delta\chi^*\, H\chi\, dV$, and
taking account of the orthogonality and normalizing conditions in the
form

$$\int \delta\psi_i^*(x)\, \psi_k(x)\, dx = 0,$$

we get instead of (366) the equations

$$\Big\{ E + B + \frac{h}{2\pi i} \frac{\partial}{\partial t} \Big\} \psi_i(x) - \sum_{k=1}^{n} (A_{ki} + b_{ki})\, \psi_k(x) = 0, \tag{374}$$

where $b_{ki}$ are numerical coefficients, or more generally functions of the
time.

These can be determined in the same way as the coefficients $\lambda_{ki}$,
i.e. by expressions similar to (366 a) and differing from the latter by
additional terms $\frac{h}{2\pi i} \int \psi_k^*\, \frac{\partial \psi_i}{\partial t}\, dx$:

$$b_{ki} = \lambda_{ki} + \frac{h}{2\pi i} \int \psi_k^*\, \frac{\partial \psi_i}{\partial t}\, dx. \tag{374 a}$$

Hence we see that they must satisfy the conditions

$$b_{ik} = b_{ki}^*$$

and are otherwise quite arbitrary. Taking the sum $\sum_i b_{ii}$
we have, according to (374),

Z b « = l *«- | J 'PtiE+BWi dx + || J (A ki +b kt )M k dx. 

Now in view of the orthogonality and normalizing relations 

2 ^ J ^ki dx — ^ b u . 

The sum 2 thus drops out of the preceding equation which reduces 

i 

to the equation (367 a) for 2 

i 

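The arbitrary coefficients $b_{ki}$ can be eliminated from the equations
(374) in the same way as from the equations (366), namely, by multiplying
(374) by $\psi_i^*(x')$, subtracting the product with $\psi_i(x)$ of the
corresponding (conjugate complex) equation for $\psi_i^*(x')$, and summing over $i$.

We thus get, instead of (368),

$$-\frac{h}{2\pi i} \frac{\partial}{\partial t} \rho(x, x') = (E_x + B_x - E_{x'} - B_{x'})\, \rho(x, x') - \int [F(x, x'') - F(x', x'')]\, \rho(x, x'')\, \rho(x'', x')\, dx'', \tag{375}$$

or in the matrix form corresponding to (370)

$$-\frac{h}{2\pi i} \frac{\partial \rho}{\partial t} = K\rho - \rho K. \tag{375 a}$$

This relation should be distinguished from the expression

$$\frac{dM}{dt} = \frac{2\pi i}{h} (HM - MH) = [H, M]$$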

for the time derivative of any matrix or operator which is specified in 
terms of the same variables as the Hamiltonian H of the system and 
which does not contain the time explicitly. If K is considered as the 
energy matrix of our system of electrons reduced by the method of the 
self-consistent field to a single particle, then the expression 

$$\frac{2\pi i}{h} (K\rho - \rho K) = [K, \rho]$$

gives that part of the time derivative of $\rho$ which corresponds to the
rate of change of the dynamical variables ($x$ and $p_x$ for example)
through which it can be expressed. The total derivative of $\rho$ with
respect to the time will thus be

$$\frac{d\rho}{dt} = \frac{\partial \rho}{\partial t} + \frac{2\pi i}{h} (K\rho - \rho K) = 0 \tag{375 b}$$

in virtue of (375 a), which means that $\rho$ is a constant of the motion
determined by $K$.

The total energy of the system of electrons $W$ as given by (371) or
(371 a) is not a matrix but an ordinary number. If the external field
involved in the proper energy of an electron $E$ does not depend upon
the time, $W$ can be shown to be a constant of the motion. We have
in fact, in exactly the same way as in the derivation of (371 b),

$$\frac{dW}{dt} = D\Big( \frac{\partial \rho}{\partial t}\, K \Big),$$

or, according to (375 a),

$$\frac{dW}{dt} = -\frac{2\pi i}{h}\, D[(K\rho - \rho K) K] = -\frac{2\pi i}{h}\, [D(K\rho K) - D(\rho K^2)],$$

which is easily seen to vanish.
The matrix

$$\rho\,(E + \tfrac{1}{2} B - \tfrac{1}{2} A) = \tfrac{1}{2}\, \rho\,(E + K)$$

could be formally defined as the energy matrix of the system of electrons,
without, however, attaching any dynamical meaning to this
definition, for it is the matrix $K$ only which is entitled to play the role
of the energy matrix for a single particle. The matrix $K$ differs from
an ordinary energy matrix such as $E$ by the fact that it is itself determined
by the character of the motion, and that accordingly it cannot
be represented by an operator of the usual form $p^2/2m + U(x)$ even
with an unknown potential-energy function $U(x)$.

One might be tempted to replace the equation (375 a) by an
equation of the usual Schrödinger type

$$-\frac{h}{2\pi i} \frac{\partial}{\partial t} \psi(x) = K \psi(x).$$

The latter can in fact be shown to be equivalent to (375 a) or to (375)
in the special case of a single electron (but not otherwise). Replacing
$K$ by an energy operator of the ordinary type, $E$ say, multiplying the
equation

$$-\frac{h}{2\pi i} \frac{\partial}{\partial t} \psi(x) = E_x\, \psi(x)$$

by $\psi^*(x')$ and subtracting from it the equation

$$\frac{h}{2\pi i} \frac{\partial}{\partial t} \psi^*(x') = E_{x'}\, \psi^*(x')$$

multiplied by $\psi(x)$, we get

$$-\frac{h}{2\pi i} \frac{\partial}{\partial t} \rho(x, x') = (E_x - E_{x'})\, \rho(x, x'),$$

which is a special case of (375 b). The equation (375) or (375 a) can
thus be considered as the generalization of the wave mechanics of a
single electron, which makes it possible, through the introduction of the
density matrix $\rho(x, x')$ instead of ordinary wave functions, to describe
the motion of a system of $n$ electrons in exactly the same way (with
a modified definition of the energy matrix $K$) as the motion of a single
electron.
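The single-electron case can be illustrated on a two-level model. The sketch below is an assumption-based example, not part of the text: it takes the wave equation in the form $-\frac{h}{2\pi i}\,\partial\psi/\partial t = E\psi$ with a hypothetical 2×2 Hermitian matrix `E`, units in which $h = 1$, and an arbitrary state `psi`. It compares the rate of change of $\rho = \psi\psi^*$ obtained from the wave equation with the commutator expression $-\frac{2\pi i}{h}(E\rho - \rho E)$.

```python
import math

H_PLANCK = 1.0  # units with h = 1: an assumption made for this illustration

def outer(u, v):
    """Matrix u_i * conj(v_j), the analogue of psi(x) psi*(x')."""
    return [[a * b.conjugate() for b in v] for a in u]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Hypothetical Hermitian energy matrix and normalized state.
E = [[1.0 + 0j, 0.5 - 0.2j],
     [0.5 + 0.2j, -0.3 + 0j]]
psi = [0.6 + 0j, 0.8j]

# d(psi)/dt = -(2*pi*i/h) * E psi
factor = -2j * math.pi / H_PLANCK
dpsi = [factor * sum(E[i][j] * psi[j] for j in range(2)) for i in range(2)]

# d(rho)/dt from the product rule applied to rho = psi psi*
left, right = outer(dpsi, psi), outer(psi, dpsi)
drho_wave = [[left[i][j] + right[i][j] for j in range(2)] for i in range(2)]

# d(rho)/dt from the commutator form
rho = outer(psi, psi)
E_rho, rho_E = matmul(E, rho), matmul(rho, E)
drho_comm = [[factor * (E_rho[i][j] - rho_E[i][j]) for j in range(2)]
             for i in range(2)]

mismatch = max(abs(drho_wave[i][j] - drho_comm[i][j])
               for i in range(2) for j in range(2))
print(mismatch)
```

The two expressions should agree to rounding error; for several electrons the matrix $K$ itself depends on $\rho$, so no such linear wave equation is available.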

The complete disappearance of the number of electrons from the 
equations of the general theory seems at first sight very puzzling. This 
number must obviously be introduced a posteriori as an integration 
constant, or more exactly as a sort of quantum number, specifying the 
system under consideration. 

We thus see that the theory of the density matrix naturally leads 
to a further development of quantum theory in the sense of second 
quantization discussed already in Part I (see next chapter). 

45. Approximate Solutions (Thomas-Fermi-Dirac Equation) 

Using Dirac’s notation for the matrix elements and for the wave functions
we can transform the matrix $\rho$ from the point of view of $x$ to
that of $K$ with the help of the following equations:

$$(K'|\rho|K'') = \iint (K'|x')\, dx'\, (x'|\rho|x'')\, dx''\, (x''|K''), \tag{376}$$

the matrix elements $(K'|\rho|K'')$ and $(x'|\rho|x'')$ being both of the ‘pure’
type, corresponding to a definite point of view. We can, however, define
in a similar way the mixed elements of $\rho$ corresponding to a ‘double’
point of view $(K, E)$ which serves to connect the two matrices $K$ and
$E$ with each other. These elements are given by the formula

$$(E'|\rho|K') = \iint (E'|x')\, dx'\, (x'|\rho|x'')\, dx''\, (x''|K'), \tag{376 a}$$

which is similar to the equation

$$(E'|K') = \int (E'|x)\, dx\, (x|K') \tag{376 b}$$

for the transformation coefficients $(E'|K')$ (cf. § 18), and reduces to it
in the limiting case $n = \infty$ according to (372 a). The wave function
$(x|K')$ appearing in the transformation equations (376 a) and (376 b)
replaces in a certain sense the whole set of individual wave functions
$\psi_1(x), \ldots, \psi_n(x)$ associated with the given value of $K'$, in agreement with
the fact that each of the $n$ electrons on account of the exchange phenomenon
must be distributed over all of them.

The introduction of the wave functions $(x|K')$, although it is by no
means necessary nor even convenient, raises the question as to the
possibility of representing the energy $K$ as a function (of a perhaps
unusual type) of the dynamical variables $x$ and $p$ ($= p_x$) used in the
wave mechanics of a single electron. Now, since $K$ is defined as a
function of the density matrix, this question amounts to the transformation
of the latter from the original viewpoint of $x$ to the ‘mixed’
viewpoint $(x, p)$. In other words, we must find the transformation from
the ‘pure’ matrix elements $(x|\rho|x')$ to the ‘mixed’ matrix elements
$(x|\rho|p)$. This transformation is given by the equation

$$(x|\rho|x') = \int (x|\rho|p)\, dp\, (p|x') \tag{377}$$

or the reciprocal equation

$$(x|\rho|p) = \int (x|\rho|x')\, dx'\, (x'|p), \tag{377 a}$$

where $(x|p) = (p|x)^*$ is the well-known function

$$(x|p) = c\, e^{i 2\pi\, x \cdot p / h}, \tag{377 b}$$

$x \cdot p$ being an abbreviation for $x p_x + y p_y + z p_z$. The function (377 b) is
understood to be normalized according to the condition

$$\int (x|p)\, (p'|x)\, dx = \delta(p - p').$$

We shall give here, following Dirac, an approximate solution of this
problem by treating $x$ and $p$ as ordinary, i.e. mutually commuting,
quantities in the sense of classical mechanics. The density $\rho$ (as well
as the energy $K$) will thus appear as a function $\rho(x, p)$ of the coordinates
and momenta of an electron, its product with the volume element of
the phase space $dx\, dp$ being proportional to the probability of finding
the electron in this volume element, or, in other words, to the relative
number of electrons to be expected in the latter. This physical meaning
of $\rho$ will become apparent from the following argument.

Let us consider $\rho(x, p) = \rho_x(p)$ as a function of $p$ for a fixed value
of $x$ and expand it in a Fourier integral†

$$\rho_x(p) = \int \rho_{x\xi}\, e^{i 2\pi\, p \cdot \xi / h}\, d\xi. \tag{378}$$

This expansion is quite similar to the expansion of a function $f$ of the
time $t$, the coordinate of an electron for example, for a motion with
a fixed energy $W$,

$$f_W(t) = \sum_w f_{Ww}\, e^{i 2\pi\, w t / h},$$

$w/h$ being the frequency. Now in the latter case the Fourier coefficient
$f_{Ww}$ is well known (by the correspondence principle, Chap. III, § 12) to
represent approximately the matrix element of $f$, $(W|f|W+w)$, for two
neighbouring states with the energies $W$ and $W + w$ (provided $w \ll W$).

† It should be remembered that $x$ and $p$ are meant to denote the triplets of coordinates
and momenta, and that $dx$ actually means $dx\, dy\, dz$.

Since a coordinate $x$ and the corresponding momentum $p_x$ are related
to each other in exactly the same way as the energy and the time
(being canonically conjugate quantities), the Fourier coefficients of
$\rho_x(p)$ in (378) must likewise represent approximately the matrix elements
$(x|\rho|x+\xi)$ of $\rho$. The function $\rho(x, p)$ corresponding to the classical
definition of $\rho$ (as a quantity commuting with $x$) can thus be calculated
with the help of the matrix $(x|\rho|x')$ by the formula

$$\rho(x, p) = \int (x|\rho|x+\xi)\, e^{i 2\pi\, p \cdot \xi / h}\, d\xi. \tag{378 a}$$

Comparing this with (377 a) and (377 b) we obtain the relation

$$(x|\rho|p) = \rho(x, p)\, e^{i 2\pi\, x \cdot p / h}. \tag{379}$$

The Fourier coefficients in (378) can be calculated by the formula

$$\rho_{x\xi} = (x|\rho|x+\xi) = \int \rho(x, p)\, e^{-i 2\pi\, p \cdot \xi / h}\, dp / h^3, \tag{380}$$

$h^3$ appearing instead of $h$ because $dp$ actually denotes here the product
$dp_x\, dp_y\, dp_z$.

Putting here $\xi = 0$, we get

$$(x|\rho|x) = \frac{1}{h^3} \int \rho(x, p)\, dp, \tag{380 a}$$

whence it is clear that $\rho(x, p)$ can be defined as the probable number
of electrons per volume $h^3$ of the phase-space.

The preceding equations are obviously valid not only for the matrix
$\rho$ but also for any other matrix of the same type, and in particular for
the energy matrix $K$. Expressing it as a function of the variables $x$, $p$
we thus get

$$K(x, p) = E(x, p) + B(x, p) - A(x, p), \tag{381}$$

where $E(x, p)$ is the usual (classical) expression for the electron’s own
energy $p^2/2m + U(x)$,

$$B(x, p) = B(x) \int \delta(\xi)\, e^{i 2\pi\, p \cdot \xi / h}\, d\xi;$$

that is,

$$B(x, p) = B(x) = e^2 \int \frac{\rho(x', x')}{r(x, x')}\, dx', \tag{381 a}$$

the usual expression for the Coulomb energy of an electron in a cloud of
electric charge with the volume density $e\rho(x', x')$, and

$$A(x, p) = \int (x|A|x+\xi)\, e^{i 2\pi\, p \cdot \xi / h}\, d\xi. \tag{381 b}$$

Taking for the matrix $(x|A|x+\xi) = A(x, x+\xi)$ its expression (369 a) and
substituting for $(x|\rho|x+\xi)$ the expression (380), we get

$$A(x, p) = \iint F(x, x+\xi)\, \rho(x, p')\, e^{i 2\pi\, (p - p') \cdot \xi / h}\, d\xi\, dp' / h^3.$$

Now $(p - p') \cdot \xi$ is the scalar product $(p_x - p'_x)\xi_x + (p_y - p'_y)\xi_y + (p_z - p'_z)\xi_z$
of the vectors $p - p'$ and $\xi$. Keeping the vector $p - p' = g$ fixed and
denoting by $\theta$ the angle made with it by the variable vector $\xi$, we can replace
the volume-element $d\xi$ ($= d\xi_x\, d\xi_y\, d\xi_z$) by the expression $2\pi |\xi|^2\, d|\xi|\, \sin\theta\, d\theta$;
since $F(x, x+\xi)$ depends on the magnitude $|\xi| = r$ of the vector $\xi$ only,
we can carry out the integration over $\theta$, keeping $r$ constant. This gives

$$\int e^{i 2\pi\, g \cdot \xi / h}\, d\xi = 2\pi r^2\, dr \int_{-1}^{+1} e^{i 2\pi\, g r \cos\theta / h}\, d(\cos\theta) = 2\pi r^2\, dr\, \frac{\sin(2\pi g r / h)}{\pi g r / h} = 2 r\, dr\, \frac{\sin(2\pi g r / h)}{g / h}$$

and consequently

$$\int (x|A|x+\xi)\, e^{i 2\pi\, p \cdot \xi / h}\, d\xi = 2 e^2 \int \frac{\rho(x, p')\, dp'}{h^2\, |p - p'|} \int_0^\infty dr\, \sin(2\pi g r / h).$$

Now

$$\int_0^\infty \sin \alpha r\, dr = -\frac{1}{\alpha}\, [\cos \alpha r]_0^\infty,$$

which is equal to $1/\alpha$ plus an indeterminate constant which can actually be
dropped. In fact, if instead of integrating over $r$ to $\infty$ we first extend
the integration to some large finite value, $R$ say, and pass to the limit
$R = \infty$ after carrying out the subsequent integration over $p'$, the term
containing $R$ vanishes. We thus get finally

$$A(x, p) = \frac{e^2}{\pi h} \int \frac{\rho(x, p')\, dp'}{|p - p'|^2}. \tag{382}$$

If the function $\rho(x, p')$ is replaced here by the function $f(x, p') = \rho / h^3$,
giving the probable number of electrons per unit volume of the phase-space,
the preceding expression assumes the form

$$A(x, p) = \frac{e^2 h^2}{\pi} \int \frac{f(x, p')\, dp'}{|p - p'|^2}, \tag{382 a}$$

which shows that the exchange energy, being purely a quantum effect,
vanishes with $h$, as of course it should provided the function $f$ remains
finite (which simply means that the number of electrons is finite).
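The two elementary integrals used in this reduction can be checked numerically. The sketch below is an illustration under stated assumptions: the angular integral is evaluated by a simple midpoint rule, and the radial integral of $\sin \alpha r$ is regularized by a convergence factor $e^{-\epsilon r}$, which is a different device from the cut-off $R$ used in the text but leads to the same value $1/\alpha$ in the limit. Function names and parameter values are chosen here for illustration.

```python
import cmath
import math

def angular_integral(a, steps=20000):
    """Midpoint-rule value of the integral of exp(i*a*mu) for mu in [-1, 1]."""
    d = 2.0 / steps
    return sum(cmath.exp(1j * a * (-1.0 + (k + 0.5) * d))
               for k in range(steps)) * d

def damped_sine_integral(alpha, eps, upper, steps=200000):
    """Midpoint-rule value of the integral of exp(-eps*r)*sin(alpha*r) on [0, upper]."""
    d = upper / steps
    total = 0.0
    for k in range(steps):
        r = (k + 0.5) * d
        total += math.exp(-eps * r) * math.sin(alpha * r)
    return total * d

a = 3.0
angular_numeric = angular_integral(a)
angular_exact = 2.0 * math.sin(a) / a        # closed form 2 sin(a) / a

alpha, eps = 3.0, 0.05
radial_numeric = damped_sine_integral(alpha, eps, upper=400.0)
radial_exact = alpha / (alpha ** 2 + eps ** 2)   # tends to 1/alpha as eps -> 0

print(angular_numeric, angular_exact)
print(radial_numeric, radial_exact, 1.0 / alpha)
```

As the damping constant is reduced the regularized radial integral approaches $1/\alpha$, which is the value retained in (382).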

If the function $f(x, p')$ vanishes for large values of $p'$, then for
sufficiently large values of $p$ we can put approximately

$$\int \frac{f(x, p')\, dp'}{|p - p'|^2} \cong \frac{1}{p^2} \int f(x, p')\, dp' = \frac{1}{p^2}\, \rho(x, x)$$

and consequently

$$A(x, p) \cong \frac{e^2 h^2}{\pi p^2}\, \rho(x, x). \tag{383}$$

It was shown by Fock that this expression can be applied for all
values of p in the case of an electron moving in the electric field of 
a number of other electrons, if its reaction upon the latter can be 
neglected. Thus, for example, if we consider an alkali atom containing 
n electrons of which n— 1 form its inner core, while one can be treated 
as an outsider (although it actually ‘dives’ into the core), then the 
effect of the interchange of roles between this outsider and the core 
electrons is the same as if the energy of the external electron were 
decreased by the amount (383). Fock’s formula can be obtained by 
applying the variation principle to that part of the total energy 
W = J x*Hx dV which, besides the proper energy E of the ‘external’ 
electron, contains terms representing its interaction with the other 
electrons (whose motion is supposed to be given, i.e. to remain un¬ 
affected by this interaction). 

Taking for $W$ the expression (371) and putting $\rho = \rho_0 + \rho_1$ we easily
get for the part in question the expression

$$W_1 = \int \psi_1^*\, E\, \psi_1\, dx + \iint F(x, x')\, [\rho_0(x', x')\, \rho_1(x, x) - \rho_0(x, x')\, \rho_1(x', x)]\, dx\, dx', \tag{384}$$

where

$$\rho_1(x, x') = \psi_1(x)\, \psi_1^*(x')$$

is the contribution to the total density matrix $\rho(x, x')$ of the electron
under consideration and $\psi_1(x)$ its wave function. The latter is to be
determined from the condition $\delta W_1 = 0$ (the normalization condition
being now irrelevant). This leads to the equation

$$\{E(x, p) + B_0(x) - A_0\}\, \psi_1 = W_1\, \psi_1, \tag{385}$$

where $B_0(x) = e^2 \int \frac{\rho_0(x', x')}{r(x, x')}\, dx'$ is the Coulomb energy of the electron
in question with respect to the rest, while $A_0$ is the operator of the
exchange energy [including the physically irrelevant action of the
electron upon itself which must be subtracted from $B_0(x)$]. It has an
unusual form, being defined by

$$A_0\, \psi_1(x) = e^2 \int \frac{\rho_0(x, x')}{r(x, x')}\, \psi_1(x')\, dx'. \tag{385 a}$$


Now, as has been pointed out by Fock, the quantity $1/r(x, x')$, i.e. the
reciprocal of the distance between the points $x$ and $x'$, can be considered
(if we leave aside for a while the spin coordinates) as the matrix element
with respect to $x$ and $x'$ of the operator $-4\pi/\nabla^2$, where $\nabla^2$ is Laplace’s
operator. This follows from the fact that

$$\phi(x) = \int \frac{f(x')}{r(x, x')}\, dx'$$

is the solution of the equation

$$\nabla^2 \phi = -4\pi f,$$

which can be rewritten symbolically in the form

$$\phi = -\frac{4\pi}{\nabla^2}\, f.$$
In applying this result to (385 a) we must take care of the fact that
$1/\nabla^2$ operates on a function of $x'$ leaving $x$ constant. We must accordingly
come back to the original expression $\rho_0(x, x') = \sum_{i=1}^{n-1} \psi_i(x)\, \psi_i^*(x')$ for
the density (where $n - 1$ denotes the number of electrons in the core)
and insert the operator $1/\nabla^2$ between the $\psi_i(x)$ and $\psi_i^*(x')$ of the separate
terms.

Since $\dfrac{1}{\nabla^2} = -\dfrac{h^2}{4\pi^2 p^2}$, we obtain the following expression for $A_0\psi_1(x)$:

$$A_0\, \psi_1(x) = \frac{e^2 h^2}{\pi} \sum_{i=1}^{n-1} \psi_i(x)\, p^{-2}\, [\psi_i^*(x)\, \psi_1(x)],$$

where $x'$ in $\psi_i^*(x')$ and $\psi_1(x')$ has been replaced by $x$ in view of the fact
that $p^{-2}$, by definition, converts a function $f(x')$ of $x'$ into a function
$\phi(x)$ of $x$.

If we now wish to consider the approximation corresponding to the
classical mechanics we must treat $p$ as an ordinary number, which
enables us to rewrite the preceding formula as follows:

$$A_0\, \psi_1(x) = \frac{e^2 h^2}{\pi p^2}\, \rho_0(x, x)\, \psi_1(x),$$

and leads us back to the expression (383) for the operator $A_0$.

We have hitherto made no explicit use of the spin variables which
were understood to be included in $x$ and $p$ whenever they were necessary.
It is easy to rewrite the preceding equations with an explicit
notation for the spin variables. So long, however, as the dynamical
effects of the spin are neglected, its only influence will be to double the
maximum value of $\rho(x, p)$ which is allowed by the exclusion principle.
As has been stated above, $\rho(x, p)$ can be considered as the number of
electrons per volume $h^3$ of the classical phase-space (which, as we
know, corresponds to one single spinless state in the sense of classical
mechanics). Inasmuch as the inclusion of the spin allows each spinless
state to be doubly occupied (by electrons with their spin axes in
opposite directions), the effect of the spin will be simply to increase the
maximum value of $\rho(x, p)$ from 1 to 2.

If we consider a system of electrons, such as a complex atom, for
example, in the normal state, i.e. in the state of lowest energy $W$, we
can assume all the individual states of lowest energy to be doubly
occupied, or, in other words, all that part of the phase-space $x$, $p$ which
corresponds to the least possible value of the energy to be filled with
the maximum density $\rho = 2$ and the rest to remain quite empty. The
shape of the boundary surface can be determined from the condition
that $\rho$ is a constant of the motion as determined by the energy $K$. This
means, since $d\rho/dt = 0$, that $\rho$ must be a function of $K$, and that
consequently the boundary surface we are looking for must be a surface
of constant $K$.

Now we have

$$K = E(r, p) + B(r) - \frac{e^2}{\pi h} \int \frac{\rho(r, p')}{|p - p'|^2}\, dp', \tag{386}$$

where $r$ is written instead of $x$ in order to indicate the fact that we no
longer include the spin coordinate. Since within the part of the phase-space
which comes into play

$$\rho(r, p') = \text{const.} = 2,$$

the preceding equation is reduced to

$$K = E(r, p) + B(r) - \frac{2 e^2}{\pi h} \Big( \int \frac{dp'}{|p - p'|^2} \Big)_r, \tag{386 a}$$

where the integral

$$\Big( \int \frac{dp'}{|p - p'|^2} \Big)_r = \iiint \frac{dp'_x\, dp'_y\, dp'_z}{(p_x - p'_x)^2 + (p_y - p'_y)^2 + (p_z - p'_z)^2}$$

must be extended over all the saturated part of the momentum space
$p'$ which is associated with a given point of the ordinary space $r$.

In order to evaluate this integral we must make some assumption
as to the shape of this saturated momentum space. We shall assume
it to be spherical, its radius $P_r$ being a certain (for the present undetermined)
function of $r$. We then get

$$A(r, p) = \frac{2 e^2}{h} \left[ \frac{P_r^2 - p^2}{p}\, \log \left| \frac{P_r + p}{P_r - p} \right| + 2 P_r \right]. \tag{387}$$

We have further

$$\rho(r, r) = \frac{1}{h^3} \int \rho(r, p)\, dp = \frac{8\pi}{3 h^3}\, P_r^3, \tag{388}$$

and consequently

$$B(r) = \frac{8\pi}{3 h^3}\, e^2 \int \frac{P_{r'}^3}{|r - r'|}\, dr'. \tag{388 a}$$

At the boundary surface we must have $p = P_r$, and consequently

$$A(r, P_r) = \frac{4 e^2}{h}\, P_r.$$

The equation of this boundary reduces accordingly to

$$K = E(r, P_r) + B(r) - \frac{4 e^2}{h}\, P_r = \text{const.} \tag{389}$$

This equation serves to determine $P$ as a function of $r$. It can be
replaced by a differential equation of the Poisson type if we take into
account the fact that $B(r)$ is the product of the charge $e$ of an electron
and the potential $\phi$ due to a distribution of charge with a density
(388) multiplied by $e$. We thus get, applying the Laplace operator $\nabla^2$
to the equation (389) and assuming $E(r, p)$ to be of the usual form
$p^2/(2m) + U(r)$ with $\nabla^2 U = 0$,

$$\nabla^2 \Big( \frac{P^2}{2m} \Big) + \nabla^2 B(r) - \frac{4 e^2}{h}\, \nabla^2 P = 0,$$

or

$$\nabla^2 \Big( \frac{P^2}{2m} - \frac{4 e^2}{h}\, P \Big) = 4\pi e^2 \rho(r, r) = \frac{32 \pi^2 e^2}{3 h^3}\, P^3. \tag{390}$$

This equation (due to Dirac) is a generalization of the equation of the 
Thomas-Fermi theory which has been considered in Part I, § 32. It 
differs from the latter by the additional term — 4 e 2 P/h which represents 
the exchange effect (and also eliminates the self-action of the electrons), 
the electric potential or the density function being replaced by the 
function P. 
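
The closed form of the momentum-space integral over a spherical saturated region, on which (387) rests, can be checked numerically. The following is only a sketch under stated assumptions: the angular integration is carried out analytically first, the remaining radial integral is evaluated by a plain midpoint rule, and the function names and sample values of p and P are illustrative choices, not part of the text.

```python
import numpy as np

def exchange_integral_closed_form(p, P):
    """pi * [ (P^2 - p^2)/p * log|(P+p)/(P-p)| + 2P ], valid for 0 < p < P."""
    return np.pi * ((P**2 - p**2) / p * np.log(abs((P + p) / (P - p))) + 2.0 * P)

def exchange_integral_numeric(p, P, n_cells=200_000):
    """Midpoint-rule value of the integral of 1/|p - p'|^2 over |p'| < P.

    After the angular integration the integral becomes
    (2*pi/p) * integral from 0 to P of p' * log|(p + p')/(p - p')| dp'.
    The logarithmic singularity at p' = p is integrable.
    """
    h = P / n_cells
    q = (np.arange(n_cells) + 0.5) * h      # cell midpoints
    integrand = q * np.log(np.abs((p + q) / (p - q)))
    return 2.0 * np.pi / p * np.sum(integrand) * h

# Illustrative values (assumed): for these, no midpoint coincides with p.
P = 1.0
p = 0.5
numeric = exchange_integral_numeric(p, P)
closed = exchange_integral_closed_form(p, P)
print(numeric, closed)
```

Multiplying the closed form by $2e^2/(\pi h)$ gives the bracketed expression of (387); letting p approach P leaves only the term $2P$, in agreement with the boundary value $4e^2P_r/h$.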



IX 


SECOND (INTENSITY) QUANTIZATION AND 
QUANTUM ELECTRODYNAMICS 


46. Second Quantization with respect to Electrons 


The reduction of the problem of the motion of a number of identical 
particles to that of a single one, carried out in the preceding chapter,
involves a more or less rough approximation. A similar reduction can, 
however, be achieved in a different way, which corresponds to the 
method of copies which was sketched in Part I, § 20, and is connected 
with a quantization of the amplitudes of the waves representing the 
motion of a single particle. This procedure may be denoted as ‘second’ 
or ‘intensity’ quantization. 

This method was inaugurated by Dirac in connexion with the 
theory of light quanta for a system of particles which are describable 
by a symmetrical wave function. We shall, however, develop it 
in the first place for a system of electrons which will lead us to a 
generalization and improvement of the results obtained in the
preceding chapter.

In describing a system of N electrons we have used hitherto only
N individual wave functions $\psi_1(x), \ldots, \psi_N(x)$, which enable us in
the case of stationary states to account for the exchange degeneracy
only. We shall now introduce an infinite set of mutually orthogonal
and normalized wave functions of this sort (with spin), leaving their
form undetermined for a while (they may, for example, represent the
motion of an electron in the external field alone with neglect of its
mutual action with all the other electrons). We shall further combine
them into sets of N functions and for each set form an antisymmetrical
function $\chi$ in the same way as before. Instead of identifying $\chi$,
however, with the exact wave function $\Omega(x_1, x_2, \ldots, x_N, t)$ describing
the motion of the electrons, we shall define the latter as a linear
combination of all such functions,

    $\Omega = \sum_n C_n \chi_n$,    (391)

and shall determine the coefficients $C_n$ as functions of the time in such
a way as to make $\Omega$ an actual solution of the exact wave equation
(which can involve the spin variables). The coefficients $C_n$ satisfying
this condition are determined by the well-known equations of the
perturbation theory,

    $-\frac{h}{2\pi i}\frac{dC_n}{dt} = \sum_{n'} H_{nn'} C_{n'}$,    (392 a)

where

    $H_{nn'} = \int \chi_n^* H \chi_{n'}\,dX$    (392 b)

(the 'integration' over the coordinates X of all the electrons being
understood to include a summation over the spin coordinates).

The indices n specifying the functions $\chi$ can be considered as
representing the totality of the numbers $n_1, n_2, \ldots, n_r, \ldots$ corresponding
to the individual wave functions $\psi_1, \psi_2, \ldots, \psi_r, \ldots$ and equal to 1 if
these functions are included in the set forming $\chi_n$ and to 0 in the
converse case. Thus $n_r = 1$ if the function $\psi_r$ is contained in $\chi_n$ and 0
if it is not contained in it. We could also write more fully
$\chi_n = \chi(n_1, n_2, \ldots, n_r, \ldots)$ and $C_n = C(n_1, n_2, \ldots, n_r, \ldots)$. The
numbers $n_r$ may be denoted as the partition numbers, indicating whether
the corresponding rth individual state is occupied by an electron or not.
In calculating the matrix elements (392 a) we can use the formula
[cf. (364 a), § 44]

    $H_{nn'} = \int \chi_n^* H \chi_{n'}\,dX = \sqrt{N!}\int \phi_n^* H \chi_{n'}\,dX = \int \phi_n^* H P \phi_{n'}\,dX$,    (393)

where $\phi_n$ and $\phi_{n'}$ are the factorized wave functions corresponding to
a definite distribution of the electrons between the N occupied states,
for instance,

    $\phi_n(X) = \psi_{r_1}(x_1)\psi_{r_2}(x_2)\cdots\psi_{r_N}(x_N)$    (394)

and

    $\phi_{n'}(X) = \psi_{r'_1}(x_1)\psi_{r'_2}(x_2)\cdots\psi_{r'_N}(x_N)$.    (394 a)

It will be convenient to assume for a while that the indices $r_1, r_2, \ldots, r_N$
of the occupied states are arranged in the same order as the indices
1, 2, ..., N of the electrons, i.e. that

    $r_1 < r_2 < \cdots < r_N$.    (394 b)

This means merely a certain (arbitrary) denomination of the N wave
functions $\psi$ forming the set under consideration. We could put, for
example, so long as we are concerned with this particular set, $r_1 = 1$,
$r_2 = 2$, ..., $r_N = N$. The order of the indices in the other sets n' must
of course be left arbitrary. So long as H has the usual form

    $H = \sum_{i=1}^{N} E(x_i, p_i) + \sum_{i<k} F(x_i, x_k)$

the matrix elements (393) will vanish identically if the set n' differs
by more than two individual states from the set n (in view of the
orthogonal property of the wave functions $\psi$). We must therefore
distinguish three cases:

(1) n = n', i.e. $n_r = n'_r$ for all values of r, or simply $\chi_n = \chi_{n'}$. In
this case the matrix element (393) reduces to the value of the energy
W already calculated in § 40. Putting

    $\int \psi_r^* E \psi_{r'}\,dx = E_{rr'}$    (395)

and

    $\iint \psi_r^*(x)\psi_s^*(x')F(x,x')\psi_{r'}(x)\psi_{s'}(x')\,dx\,dx' = F_{rs;r's'}$,    (395 a)

we can rewrite it in the following form:

    $H_{nn} = \sum_r E_{rr} + \sum_{r<s}(F_{rs;rs} - F_{rs;sr})$,    (396)

which is easily seen to coincide with (368 a).

(2) The set n differs from n' by the fact that one function, $\psi_{r_i}(x_i)$ say,
is replaced by another, $\psi_{r'_i}(x_i)$, all the other factors in (394) and
(394 a) being the same. We then get in a similar way, putting $r_i = p$ and
$r'_i = p'$,

    $H_{nn'} = E_{pp'} + \sum_r (F_{pr;p'r} - F_{pr;rp'})$,    (396 a)

where the sub-subscript i has been dropped.

(3) The set n differs from n' by the fact that two functions $\psi_{r_i}(x_i)$
and $\psi_{r_k}(x_k)$ are replaced by different functions (not belonging to the
original set) $\psi_{r'_i}(x_i)$ and $\psi_{r'_k}(x_k)$. We get in this case, writing p for
$r_i$ and q for $r_k$,

    $H_{nn'} = F_{pq;p'q'} - F_{pq;q'p'}$   $(p < q;\ p' \neq p,\ q' \neq q)$.    (396 b)

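Let $\alpha_r$ denote an operator which when applied to $C_n = C(n_1, n_2, \ldots, n_r, \ldots)$
increases $n_r$ by unity if $n_r = 0$, that is, transforms
$C(n_1, n_2, \ldots, n_{r-1}, n_r, n_{r+1}, \ldots)$
into $C(n_1, n_2, \ldots, n_{r-1}, n_r + 1, n_{r+1}, \ldots)$; if, on the other hand,
$n_r = 1$, the operator $\alpha_r$ reduces C to zero. Let, further, $\alpha_r^\dagger$ denote
an operator which decreases $n_r$ by 1 if $n_r = 1$ and reduces C to zero if
$n_r = 0$. The coefficient $C_{n'}$, corresponding to case (2), can be written
accordingly as $\alpha_p^\dagger\alpha_{p'}C_n$ and the coefficient $C_{n'}$ corresponding to
case (3) as $\alpha_p^\dagger\alpha_q^\dagger\alpha_{q'}\alpha_{p'}C_n$. It is now possible to write the
equations (392 a) as follows:

    $-\frac{h}{2\pi i}\frac{dC_n}{dt} = K_n C_n$,    (397)

where

    $K_n = \sum_r E_{rr} + \sum_{p \neq p'} E_{pp'}\,\alpha_p^\dagger\alpha_{p'}$
    $\quad + \sum_{r<s}(F_{rs;rs} - F_{rs;sr}) + \sum_{p \neq p'}\sum_r (F_{pr;p'r} - F_{pr;rp'})\,\alpha_p^\dagger\alpha_{p'}$
    $\quad + \sum_{p<q}\sum_{p' \neq p}\sum_{q' \neq q}(F_{pq;p'q'} - F_{pq;q'p'})\,\alpha_p^\dagger\alpha_q^\dagger\alpha_{q'}\alpha_{p'}$.    (397 a)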

The summation over r and s includes all those individual states which
are contained in the set n, i.e. represented by partition numbers $n_r$ and
$n_s$ equal to 1. The summation over p and q is extended only over such
of these states as are replaced by one or two other states in the set n'.
Thus the indices p, q on the one hand, and p', q' on the other, are not
independent of each other. The expression (397 a) can be further
simplified with the help of the relations $\alpha_r C_n = 0$ if $n_r = 1$,
$\alpha_r^\dagger C_n = 0$ if $n_r = 0$. Applying the operator $\alpha_r$ to $C_n$ and omitting
all the arguments except $n_r$, we thus get

    $\alpha_r C(n_r) = C(n_r + 1)$ if $n_r = 0$,   $\alpha_r C(n_r) = 0$ if $n_r = 1$,

and

    $\alpha_r^\dagger C(n_r) = C(n_r - 1)$ if $n_r = 1$,   $\alpha_r^\dagger C(n_r) = 0$ if $n_r = 0$.

Under these conditions it is possible to represent the operators $\alpha_r$
and $\alpha_r^\dagger$ as matrices from the point of view of $n_r$, considered itself as a
diagonal matrix

    $n_r = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}$    (398)

with the characteristic values 0 and 1 or, what amounts to the same
thing, as a one-column matrix $n_r = \begin{pmatrix} 0 \\ 1 \end{pmatrix}$. It should be mentioned that
the difference $1 - n_r$ must be defined accordingly as the matrix
$\begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}$ or $\begin{pmatrix} 1 \\ 0 \end{pmatrix}$ respectively.

Regarded from this point of view the operators $\alpha_r$ and $\alpha_r^\dagger$ are
represented by the matrices

    $\alpha_r = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$,   $\alpha_r^\dagger = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}$,    (398 a)

satisfying the relations

    $\alpha_r^\dagger\alpha_r = n_r$,   $\alpha_r\alpha_r^\dagger = 1 - n_r$.    (398 b)

Any function $C(n_1, n_2, \ldots, n_r, \ldots)$ of the matrix arguments $n_r$, or more
exactly of their characteristic values, must likewise be dealt with as a
matrix. Leaving all the arguments but $n_r$ aside, we can define it as a
one-column matrix

    $C(n_r) = \begin{pmatrix} C(0) \\ C(1) \end{pmatrix}$,    (399)

whose elements correspond to the two characteristic values of $n_r$. This
gives, according to the definition (398 a) of the operators $\alpha_r$ and $\alpha_r^\dagger$,

    $\alpha_r C(n_r) = \begin{pmatrix} C(1) \\ 0 \end{pmatrix}$,   $\alpha_r^\dagger C(n_r) = \begin{pmatrix} 0 \\ C(0) \end{pmatrix}$,    (399 a)

which is in agreement with the original definition of $\alpha_r$ and $\alpha_r^\dagger$. If
therefore we agree to consider the partition numbers as the
characteristic values of the corresponding operators (which will be denoted
by the same letters), we can rewrite the sum of the first two terms in
(397 a), corresponding to the proper energy of the electrons E, as follows:

    $\sum_r E_{rr} + \sum_{p \neq p'} E_{pp'}\,\alpha_p^\dagger\alpha_{p'} = \sum_{r=1}^{\infty}\sum_{r'=1}^{\infty} E_{rr'}\,\alpha_r^\dagger\alpha_{r'}$,    (400)

since for all values of r and r' which are not actually represented
in the sum on the left-hand side of this equation, the operator $\alpha_r^\dagger\alpha_{r'}$ is
equivalent to 0.
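
The single-state matrices (398 a) and their action on the one-column matrix C can be tried out directly. The sketch below assumes the basis ordering in which the first component belongs to $n_r = 0$ and the second to $n_r = 1$; the sample values standing for C(0) and C(1) are arbitrary and purely illustrative.

```python
import numpy as np

# Assumed basis ordering: index 0 <-> n_r = 0, index 1 <-> n_r = 1.
alpha = np.array([[0, 1],
                  [0, 0]])          # matrix of alpha_r as in (398 a)
alpha_dag = alpha.T                  # matrix of the adjoint operator
n = np.diag([0, 1])                  # partition number as a diagonal matrix
one = np.eye(2, dtype=int)

C = np.array([3, 7])                 # arbitrary sample values for C(0), C(1)

print(alpha_dag @ alpha)             # reproduces n
print(alpha @ alpha_dag)             # reproduces 1 - n
print(alpha @ C)                     # (C(1), 0)
print(alpha_dag @ C)                 # (0, C(0))
```

The last two lines reproduce the rule that $\alpha_r$ shifts C(1) into the place belonging to $n_r = 0$ and annihilates the other component, while $\alpha_r^\dagger$ does the converse.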

Turning to the other terms of (397 a) which correspond to the mutual
energy of the electrons, we shall show in the first place that they can
be collected together in a form similar to the last term with no other
restriction imposed on the summation indices p, q, p', q' than the
condition p < q.

In fact the second term is easily seen to be obtained from the last
if we put q' = q = r and interpret the product $\alpha_r^\dagger\alpha_r$ as the operator
$n_r$. Since this operator commutes with $\alpha_p^\dagger$ (so long as p ≠ q) we can
write it on the left of $\alpha_p^\dagger$ and extend the summation over all values of r
which are larger than p, those terms which correspond to values of r not
represented in n being automatically cancelled.

It should be emphasized in this connexion that the order of the four
factors in the last term of (397 a),

    $\alpha_p^\dagger\alpha_q^\dagger\alpha_{q'}\alpha_{p'}$,

is not taken at random, but precisely with a view to ensuring the
inclusion of the preceding term under the condition q' = q = r. It is
easily seen in the same way that the first term containing F in (397 a) is
obtained from the following one if we put p' = p, or consequently from
the last term if we put simultaneously p = p' = r and q = q' = s with the
one restriction r < s, i.e. p < q. We thus get

    $\sum_{r<s}(F_{rs;rs} - F_{rs;sr}) + \sum_{p \neq p'}\sum_r (F_{pr;p'r} - F_{pr;rp'})\,\alpha_p^\dagger\alpha_{p'}$
    $\quad + \sum_{p<q}\sum_{p' \neq p}\sum_{q' \neq q}(F_{pq;p'q'} - F_{pq;q'p'})\,\alpha_p^\dagger\alpha_q^\dagger\alpha_{q'}\alpha_{p'}$
    $\quad = \sum_{p<q}\sum_{p'}\sum_{q'}(F_{pq;p'q'} - F_{pq;q'p'})\,\alpha_p^\dagger\alpha_q^\dagger\alpha_{q'}\alpha_{p'}$,    (400 a)

and consequently

    $K_n = K = \sum_r\sum_{r'} E_{rr'}\,\alpha_r^\dagger\alpha_{r'} + \sum_{r<s}\sum_{r'}\sum_{s'}(F_{rs;r's'} - F_{rs;s'r'})\,\alpha_r^\dagger\alpha_s^\dagger\alpha_{s'}\alpha_{r'}$,    (400 b)

where the indices p and q have been replaced by r and s in order to
indicate the fact that the summation can be extended over all values
of these indices, the terms represented by non-vanishing partition
numbers $n_r$, $n_s$ corresponding to the state n being actually the only ones
left. This is why we are now entitled to drop the index n for K. If instead
of writing $\alpha_r^\dagger\alpha_s^\dagger\alpha_{s'}\alpha_{r'}$ in the second term of (400 b) we had written
$\alpha_r^\dagger\alpha_s^\dagger\alpha_{r'}\alpha_{s'}$, it would have been impossible to include all the three
terms of (397 a) containing F in one.

The second step in the simplification of the operator consists in the
removal of the restrictive condition r < s and in the simultaneous
unification of the positive and negative summands in the second term
of (400 b), representing the mutual action of the electrons. In order to
carry out this simplification we must introduce instead of the $\alpha$'s new
operators

    $a_r = \pm\alpha_r$,   $a_r^\dagger = \pm\alpha_r^\dagger$,

with an appropriate rule for choosing the upper or lower sign, so that

we could write + 

a l(x r — a]a r (401) 

and <*l V tv = a r a t «V a r- = a l "I a r- "s' (401 a) 

“r t»J tv tv = -°r a l "1 " s '"r- (401 l>) 


ocl al ct r r — —al al K' a*> 


-a' a' a*a,>. 


This enables us to put 


2222 ( F nS»— F rs#'r’)<4 “1 £ 
r<« r’ s' 


2222 F rs,rv"r a l <V«r' + 2222 "I "r'"s 

r<s r‘ s’ r<s r’ s’ 

2222 F sr,»-r «n "l <V<V + 2222 F sr,r'»'"t "I a s' a r' 


in view of the obvious relation 


and the relations (401 a), or finally

    $K = \sum_r\sum_{r'} E_{rr'}\,a_r^\dagger a_{r'} + \tfrac12\sum_r\sum_s\sum_{r'}\sum_{s'} F_{rs;r's'}\,a_r^\dagger a_s^\dagger a_{s'}a_{r'}$,    (402)

the summation being extended over all the values of the indices r, r', s, s'
without any restrictive conditions whatsoever, all the restrictions being
carried out automatically.

In order to define the operators a explicitly we must take into account
the condition which has been stated at the beginning of this section as
to the arrangement of the indices $r_1, r_2, \ldots$ specifying the individual
states in the set n:

    $r_1 < r_2 < \cdots$.

This condition has been used in all our preceding expressions for K up
to the expression (400 b), and has been dropped only in the last
expression (402).

Now the operators a must be defined in such a way as actually to
enable us to get rid of this condition. This can be done, following
Jordan and Wigner, by putting

    $a_r = \alpha_r v_r$,   $a_r^\dagger = v_r\alpha_r^\dagger$,    (402 a)

where $v_r$ is an operator with the characteristic values 1 and −1 (i.e.
equivalent to taking $\alpha$ with the + or − sign) which is defined as the
product

    $v_r = \prod_{s=1}^{r-1}(1 - 2n_s)$.    (402 b)

The separate factors in this product are themselves operators of the
same kind as $v_r$ (the characteristic values of $n_s$ being 0 and 1, those of
the difference $1 - 2n_s$ must be +1 and −1). The operators a defined
in this way are easily seen to satisfy the conditions (401), (401 a), and
(401 b), or the more simple conditions not involving the original $\alpha$'s:

    $a_r^\dagger a_r = n_r$,    (403)

    $a_r a_s + a_s a_r = a_r^\dagger a_s^\dagger + a_s^\dagger a_r^\dagger = 0$,    (403 a)

and finally

    $a_r^\dagger a_{r'} + a_{r'} a_r^\dagger = \delta_{rr'}$.    (403 b)

We have in fact

    $a_r^\dagger a_r = v_r\alpha_r^\dagger\alpha_r v_r = v_r n_r v_r = n_r v_r^2 = n_r$,

since $n_r$ is represented by a diagonal matrix, just as $v_r$ is, and therefore
commutes with $v_r$, whose square is equal to 1, i.e. to the unit matrix.

Further, if r < s, we obviously have

    $\alpha_r\alpha_s = \alpha_s\alpha_r$

(the case r = s is devoid of interest since the operator $a_r a_r$ applied to
any function $C_n$ gives identically zero). On the other hand,

    $a_r a_s = \alpha_r v_r\alpha_s v_s = v_r\alpha_r\alpha_s v_s = v_r\alpha_s\alpha_r v_s$,

since, according to the definition (402 b), $\alpha_r$ commutes with all the
factors in $v_r$. It does not commute, however, so long as r < s, with one
factor in $v_s$, namely, $(1 - 2n_r)$. Applying it to the latter we have

    $\alpha_r(1 - 2n_r) = \alpha_r - 2\alpha_r n_r$.

Now $\alpha_r n_r = (n_r + 1)\alpha_r$ if $n_r = 0$, and 0 if $n_r = 1$. In both cases we get

    $\alpha_r(1 - 2n_r) = -(1 - 2n_r)\alpha_r$,

and consequently

    $\alpha_r v_s = -v_s\alpha_r$   (r < s).

(Footnote to (402 b): We slightly diverge here from Jordan and Wigner by
extending the product over s to s = r − 1 instead of s = r.)

The preceding relation can be derived in a somewhat different way if
we replace $n_r$ in $1 - 2n_r$ by the product $\alpha_r^\dagger\alpha_r$. We then get

    $\alpha_r(1 - 2\alpha_r^\dagger\alpha_r) = \alpha_r - 2\alpha_r(\alpha_r^\dagger\alpha_r) = \alpha_r - 2(\alpha_r\alpha_r^\dagger)\alpha_r$
    $\qquad = (1 - 2\alpha_r\alpha_r^\dagger)\alpha_r = [1 - 2(1 - n_r)]\alpha_r = -(1 - 2n_r)\alpha_r$

according to (398 b), which coincides with our previous result.

Coming back to our original expression for $a_r a_s$, we have

    $a_r a_s = v_r\alpha_s\alpha_r v_s = -v_r\alpha_s v_s\alpha_r$,

or, since the three operators $v_r$, $v_s$, $\alpha_s$ commute with each other while
$v_r$ commutes also with $\alpha_r$,

    $a_r a_s = -\alpha_s v_s\,\alpha_r v_r = -a_s a_r$.

The second relation (403 a) is proved in exactly the same way.

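In the case r = r' relation (403 b) immediately follows from the
relations $\alpha_r^\dagger\alpha_r = n_r$ and $\alpha_r\alpha_r^\dagger = 1 - n_r$ (see 398 b). In order to
prove it for the case r < r' (or in general r ≠ r') we must use the fact
that the operators $\alpha_r^\dagger$ and $\alpha_{r'}$ commute with each other just as the
operators $\alpha_r$ and $\alpha_{r'}$ do. We have further, if $1 - 2n_r$ is written in the
form

    $(1 - 2n_r) = -[1 - 2(1 - n_r)] = -(1 - 2\alpha_r\alpha_r^\dagger)$,

    $\alpha_r^\dagger(1 - 2n_r) = -\alpha_r^\dagger(1 - 2\alpha_r\alpha_r^\dagger) = -[\alpha_r^\dagger - 2\alpha_r^\dagger(\alpha_r\alpha_r^\dagger)]$
    $\qquad = -[\alpha_r^\dagger - 2(\alpha_r^\dagger\alpha_r)\alpha_r^\dagger] = -(1 - 2n_r)\alpha_r^\dagger$,

so that

    $\alpha_r^\dagger v_{r'} = -v_{r'}\alpha_r^\dagger$   (r < r')

as before, and consequently, since

    $\alpha_{r'} v_r = v_r\alpha_{r'}$,

    $a_r^\dagger a_{r'} = v_r\alpha_r^\dagger\alpha_{r'} v_{r'} = \alpha_{r'} v_r\alpha_r^\dagger v_{r'} = -\alpha_{r'} v_r v_{r'}\alpha_r^\dagger = -a_{r'} a_r^\dagger$.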
Now that the relations (403), (403 a), and (403 b) are all proved, we no
longer need to think of the auxiliary operators $\alpha_r$ and $v_r$ which have
been used in their derivation and which depend on the physically
irrelevant order in which the different individual states are numbered
in the set n. The above relations are self-supporting, for they specify
in a perfectly unambiguous way the operators a which serve to
express the energy operator of our problem K. These operators can be
represented by certain matrices, from the point of view of the matrix
formed by the totality of the partition numbers $n_1, n_2, \ldots$, in a way
implying a certain ordered arrangement of the different individual
states. So long as we are interested in one particular state only we can
define the corresponding partition number $n_r$ as a two-dimensional
matrix (398) and represent the operators $a_r$ and $a_r^\dagger$ by the same matrices

    $a_r = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$,   $a_r^\dagger = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}$

as those representing the operators $\alpha_r$ and $\alpha_r^\dagger$. The difference between
them and the operators $\alpha_r$, $\alpha_r^\dagger$ becomes apparent when we take into
account all the other states s (= 1, 2, 3, ...). The general representation
of the operators $\alpha_r$, $\alpha_r^\dagger$ can be derived from (398 a) by multiplication by
unit matrices $\delta_s = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$, referring to all the other states (s ≠ r):

    $\alpha_r = \cdots \times \delta_{r+1} \times \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} \times \delta_{r-1} \times \delta_{r-2} \times \cdots \times \delta_1$,

    $\alpha_r^\dagger = \delta_1 \times \delta_2 \times \cdots \times \delta_{r-1} \times \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix} \times \delta_{r+1} \times \cdots$,

the product $M_1 \times M_2$ of the matrices $M_1$ and $M_2$ denoting a matrix M
whose elements are obtained by combining multiplicatively the
elements of $M_1$ and $M_2$. In order to obtain the general representation of
$a_r$ and $a_r^\dagger$ we must replace the r − 1 matrices $\delta_1, \delta_2, \ldots, \delta_{r-1}$ by the
matrices $1 - 2n_1, 1 - 2n_2, \ldots, 1 - 2n_{r-1}$. The matrices so defined

    $a_r = \cdots \times \delta_{r+1} \times \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} \times (1 - 2n_{r-1}) \times (1 - 2n_{r-2}) \times \cdots \times (1 - 2n_1)$,

    $a_r^\dagger = (1 - 2n_1) \times (1 - 2n_2) \times \cdots \times (1 - 2n_{r-1}) \times \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix} \times \delta_{r+1} \times \cdots$

can easily be verified to satisfy all the relations (403 a, b). The totality
of the numbers $n_1, n_2, \ldots$ must be represented accordingly by the product
of the diagonal matrices representing each of them:

    $n = n_1 \times n_2 \times \cdots$.

This operator product has, of course, nothing to do with an ordinary
product of the numbers which give the characteristic values of the
operators $n_r$ and which, as will be remembered, must satisfy the
relation

    $\sum_{r=1}^{\infty} n_r = N$.
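
The statement that the matrices built in this way satisfy (403), (403 a), and (403 b) can be verified numerically for a small number of states. The sketch below is an illustration only: it truncates the infinite set to M = 4 states, forms the direct products with `numpy.kron`, and orders the factors with state 1 leftmost, which is a convention chosen here and not taken from the text.

```python
import numpy as np
from functools import reduce

ALPHA = np.array([[0.0, 1.0], [0.0, 0.0]])   # single-state matrix (398 a)
UNIT = np.eye(2)                              # unit matrix for an untouched state
SIGN = np.diag([1.0, -1.0])                   # 1 - 2 n_s for a single state

def lowering_operators(n_states):
    """Matrices a_1 ... a_M with the factors (1 - 2 n_s) on all states s < r."""
    ops = []
    for r in range(n_states):
        factors = [SIGN] * r + [ALPHA] + [UNIT] * (n_states - r - 1)
        ops.append(reduce(np.kron, factors))
    return ops

def anticommutator(x, y):
    return x @ y + y @ x

M = 4                                         # assumed truncation
a = lowering_operators(M)
dim = 2 ** M
max_dev = 0.0
for r in range(M):
    for s in range(M):
        # (403 a): a_r a_s + a_s a_r = 0; the daggered relation is its adjoint.
        max_dev = max(max_dev, np.abs(anticommutator(a[r], a[s])).max())
        # (403 b): a_r^dagger a_s + a_s a_r^dagger = delta_rs
        target = np.eye(dim) if r == s else np.zeros((dim, dim))
        max_dev = max(max_dev, np.abs(anticommutator(a[r].T, a[s]) - target).max())

number_ops = [op.T @ op for op in a]          # n_r = a_r^dagger a_r, (403)
N_total = sum(number_ops)
eigs = sorted({int(round(v)) for v in np.diag(N_total)})
print("largest deviation from (403 a, b):", max_dev)
print("characteristic values of the sum of the n_r:", eigs)
```

Since all the matrices are real, the adjoint is obtained here simply by transposition. The sum of the $n_r$ comes out diagonal, with the whole numbers from 0 to M as its characteristic values.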

The operators $a_r$ are not Hermitian, although the symbol $a_r^\dagger$ preserves
its meaning as the operator adjoint to $a_r$, i.e. as the conjugate complex
of the transposed matrix. They can, however, be expressed with the help
of the Hermitian operators $\sigma_x$, $\sigma_y$ whose products with $h/4\pi$ represent
the components of the electron's spin [cf. § 29]. We have, namely,
limiting ourselves to one particular state,

    $\sigma_x = a_r + a_r^\dagger$,   $\sigma_y = i(a_r - a_r^\dagger)$,

whence

    $a_r = \tfrac12(\sigma_x - i\sigma_y)$,   $a_r^\dagger = \tfrac12(\sigma_x + i\sigma_y)$.    (404)

Hence it follows that

    $n_r = a_r^\dagger a_r = \tfrac14[\sigma_x^2 + \sigma_y^2 - i(\sigma_x\sigma_y - \sigma_y\sigma_x)]$,

or, according to (253), § 29,

    $n_r = \tfrac14(\sigma_x^2 + \sigma_y^2 + 2\sigma_z)$,

or, since $\sigma_x^2 = \sigma_y^2 = \delta$,

    $n_r = \tfrac12(\delta + \sigma_z)$,    (404 a)

which agrees with the definition of $n_r$ as a diagonal matrix with the
characteristic values 0 and 1. It should be noticed further that

    $1 - 2n_r = -\sigma_z$,

and that accordingly

    $v_r = \prod_{s=1}^{r-1}(-\sigma_{zs})$,

the subscript s in $\sigma_{zs}$ serving to show that it refers to the sth state.

We thus see that the energy operator K (402) can be expressed with
the help of the familiar spin operators associated with the different
states. This is natural if we remember that it is possible to represent
the interaction energy of the electrons in connexion with the exchange
effect with the help of the operators $\tfrac12(1 + \boldsymbol{\sigma}_r\cdot\boldsymbol{\sigma}_s)$, as has been
shown in § 42 of the preceding chapter. The problem is complicated in the
present case by the necessity of introducing the operators $v_r$ in order to
ensure the anticommutation of the operators $a_r$ and $a_s$ (or $a_r^\dagger$ and $a_s^\dagger$)
referring to different states (whereas the operators $\boldsymbol{\sigma}_r$ and $\boldsymbol{\sigma}_s$ must
commute with each other).

The fact that the operators $n_r$ can have only two different
characteristic values 0 and 1, which has been used as the basis of the above
definition of the operators $a_r$, can be considered as a consequence of
the properties of these operators expressed by the relations (403 a)
and (403 b) in connexion with the equation (403) which from this point
of view serves simply for the definition of the operator $n_r$. We have,
namely, multiplying the relation $a_r a_r^\dagger = 1 - n_r$ on the left by $a_r^\dagger$,

    $a_r^\dagger a_r a_r^\dagger = a_r^\dagger - a_r^\dagger n_r$,

or, since $a_r^\dagger a_r = n_r$,

    $n_r a_r^\dagger = a_r^\dagger - a_r^\dagger n_r$,

whence, by right-hand multiplication by $a_r$, we get

    $n_r^2 = n_r - a_r^\dagger n_r a_r$.

Now $a_r^\dagger n_r a_r = a_r^\dagger a_r^\dagger a_r a_r$, and according to the relations (403 a) we
must have $a_r a_r = a_r^\dagger a_r^\dagger = 0$. We thus get

    $n_r^2 - n_r = 0$,

whence it follows that the only characteristic values of $n_r$ are 0 and 1.

The preceding theory can be put in a still more significant form by
introducing the expression

    $\Psi(x) = \sum_{r=1}^{\infty} a_r\psi_r(x)$.    (405)

Being an ordinary function of the coordinates of an electron (and
eventually of the time), it is to be considered at the same time as
an operator with respect to the amplitude coefficients $C(n_1, n_2, \ldots)$ which
play the role of the wave function in the equation

    $-\frac{h}{2\pi i}\frac{dC}{dt} = KC$

with the energy operator K defined by (400 b).

Multiplying $\Psi(x)$ on the left by the adjoint operator

    $\Psi^\dagger(x) = \sum_r a_r^\dagger\psi_r^*(x)$,    (405 a)

and integrating over x (which includes as usual the summation over the
spin coordinates), we get in virtue of the orthogonality and
normalization of the functions $\psi_r(x)$:

    $\int \Psi^\dagger\Psi\,dx = \sum_r a_r^\dagger a_r = \sum_r n_r$.    (405 b)

This equation is quite similar to that corresponding to the ordinary
case of functions of the type (405) with amplitude coefficients $a_r$ defined
as ordinary numbers. Replacing such numbers by operators satisfying
the conditions (403 a) and (403 b) (or even the less restrictive conditions
$a_r a_r = 0$, $a_r^\dagger a_r^\dagger = 0$, $a_r^\dagger a_r + a_r a_r^\dagger = 1$) we obtain for the number of
electrons associated with any individual state one of the two
characteristic values 0, 1 of the operator $a_r^\dagger a_r = n_r$ in agreement with the
exclusion principle. The total number of electrons N can be defined
accordingly as a characteristic value of the sum $\sum_{r=1}^{\infty} n_r$, so that it
appears in the role of an additional 'intensity' or 'quantitative' quantum
number (cf. Part I, § 20). The operator

    $N = \sum_{r=1}^{\infty} n_r$

is easily seen to commute with the energy operator K in virtue of the
relations (403 a, b) and to represent accordingly a constant of the
motion, which means that the number of electrons forming any
particular system is constant, as of course it should be. The operator K
can itself be expressed in terms of the operator-function $\Psi(x)$ and its
adjoint operator $\Psi^\dagger(x)$ not containing explicitly the operator-coefficients
$a_r$. We have, namely,
    $\sum_r\sum_{r'} E_{rr'}\,a_r^\dagger a_{r'} = \int \Psi^\dagger(x)E\Psi(x)\,dx$

and

    $\sum_r\sum_s\sum_{r'}\sum_{s'} F_{rs;r's'}\,a_r^\dagger a_s^\dagger a_{s'}a_{r'} = \iint \Psi^\dagger(x)\Psi^\dagger(x')F(x,x')\Psi(x')\Psi(x)\,dx\,dx'$,

so that K can be written in the following form:

    $K = \int \Psi^\dagger(x)E\Psi(x)\,dx + \tfrac12\iint \Psi^\dagger(x)\Psi^\dagger(x')F(x,x')\Psi(x')\Psi(x)\,dx\,dx'$,    (406)

which is somewhat similar to the expression for the value of the energy
W given by the equation (371), § 44, if the density matrix $\rho(x,x')$ is
replaced by the product $\Psi^\dagger(x)\Psi(x')$. The main difference between the
two expressions lies in the fact that the exchange effect which is
represented by the negative term under the double integral sign in (371) is
not present in (406) where this exchange effect is automatically
accounted for by the properties of the operators $\Psi(x)$.
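
Both remarks made above, that the sum of the $n_r$ commutes with K and that the exchange term arises automatically, can be illustrated on a small matrix model. The sketch below assumes a truncation to M = 3 states and uses randomly generated numbers as stand-ins for the matrix elements $E_{rr'}$ and $F_{rs;r's'}$; these sample data and all names in the code are hypothetical.

```python
import numpy as np
from functools import reduce

def lowering_operators(n_states):
    """Matrices a_r built as direct products with sign factors on states s < r."""
    alpha = np.array([[0.0, 1.0], [0.0, 0.0]])
    sign = np.diag([1.0, -1.0])
    unit = np.eye(2)
    return [reduce(np.kron, [sign] * r + [alpha] + [unit] * (n_states - r - 1))
            for r in range(n_states)]

M = 3                                    # assumed truncation
rng = np.random.default_rng(0)
a = lowering_operators(M)
ad = [op.T for op in a]                  # adjoints (all matrices are real)

# Hypothetical sample data: symmetric E[r, r'] and an array F[r, s, r', s']
# with F[r, s, r', s'] = F[s, r, s', r'], as follows from F(x, x') = F(x', x).
E = rng.normal(size=(M, M))
E = E + E.T
G = rng.normal(size=(M, M, M, M))
F = G + G.transpose(1, 0, 3, 2)

# Energy operator in the form of the sum over a_r^dagger a_r' and the
# half-sum over a_r^dagger a_s^dagger a_s' a_r'.
K = sum(E[r, rp] * ad[r] @ a[rp] for r in range(M) for rp in range(M))
K = K + 0.5 * sum(F[r, s, rp, sp] * ad[r] @ ad[s] @ a[sp] @ a[rp]
                  for r in range(M) for s in range(M)
                  for rp in range(M) for sp in range(M))
N_op = sum(ad[r] @ a[r] for r in range(M))

commutator_norm = np.abs(N_op @ K - K @ N_op).max()

# Set n with the first two individual states occupied and the third empty.
vacuum = np.zeros(2 ** M)
vacuum[0] = 1.0
state = ad[0] @ ad[1] @ vacuum
energy = state @ K @ state
expected = E[0, 0] + E[1, 1] + F[0, 1, 0, 1] - F[0, 1, 1, 0]
print(commutator_norm, energy, expected)
```

The diagonal value obtained from the operator form contains the difference $F_{12;12} - F_{12;21}$ as in (396), although no negative term was written explicitly in K.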

Putting $F(x,x') = e^2/r(x,x')$, which corresponds to an ordinary
Coulomb interaction between the electrons, and introducing further
the operator

    $\varphi(x) = e\int \frac{\Psi^\dagger(x')\Psi(x')}{r(x,x')}\,dx'$,    (406 a)

which represents the electric potential due to a distribution of electricity
with a density $\rho(x) = e\Psi^\dagger(x)\Psi(x)$,
we can replace (406) by an expression of still more familiar type,

    $K = \int \Psi^\dagger(x)\,(E + \tfrac12 e\varphi)\,\Psi(x)\,dx$,    (406 b)

corresponding to the average value of the energy W for an electron
moving in an electric field which consists of an external part (included
in E) and a quasi-external part, due, as it were, to its own field and
represented by the electric potential $\varphi$ with the extra factor ½. It must
clearly be understood that no actual self-action of a single electron is
implied by our theory, the commutation properties of the
operator-functions $\Psi$ being precisely such as to exclude any self-action.

These commutation properties are easily derived from those of the
operators a, and from the orthogonality and normalizing conditions for
the functions $\psi_r(x)$. We have, namely, multiplying $\Psi(x)$ by $\Psi(x')$,

    $\Psi(x)\Psi(x') = \sum_r a_r\psi_r(x)\sum_s a_s\psi_s(x')$
    $\qquad = \sum_r\sum_s a_r a_s\,\psi_r(x)\psi_s(x') = -\sum_r\sum_s a_s a_r\,\psi_s(x')\psi_r(x)$,

that is,

    $\Psi(x)\Psi(x') + \Psi(x')\Psi(x) = 0$,    (407)

and likewise,

    $\Psi^\dagger(x)\Psi^\dagger(x') + \Psi^\dagger(x')\Psi^\dagger(x) = 0$.

We have further

    $\Psi^\dagger(x)\Psi(x') = \sum_r\sum_s a_r^\dagger a_s\,\psi_r^*(x)\psi_s(x')$
    $\qquad = -\sum_r\sum_s a_s a_r^\dagger\,\psi_r^*(x)\psi_s(x') + \sum_r\sum_s \delta_{rs}\,\psi_r^*(x)\psi_s(x')$,

whence

    $\Psi^\dagger(x)\Psi(x') + \Psi(x')\Psi^\dagger(x) = \delta(x - x')$,    (407 a)

where $\delta(x - x')$ denotes the product of the Dirac δ-functions for the
geometrical coordinates by $\delta_{\xi\xi'}$ if ξ and ξ' are the values of the spin
coordinates associated with the points x and x'. It should be remarked
that the formula $a_r^\dagger a_r = n_r$ is replaced in the present case by the
formula (405 b) or

    $\int \Psi^\dagger(x)\Psi(x)\,dx = N$.    (407 b)
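
The relations (407), (407 a), and (405 b) can be illustrated on a discrete model in which the coordinate x takes only M values and the integral is replaced by a sum. This is a sketch under assumptions made here for the purpose: the columns of a random real orthogonal matrix stand in for a complete orthonormal set $\psi_r(x)$, the δ-function becomes a Kronecker symbol, and M = 3.

```python
import numpy as np
from functools import reduce

def lowering_operators(n_states):
    """Matrices a_r built as direct products with sign factors on states s < r."""
    alpha = np.array([[0.0, 1.0], [0.0, 0.0]])
    sign = np.diag([1.0, -1.0])
    unit = np.eye(2)
    return [reduce(np.kron, [sign] * r + [alpha] + [unit] * (n_states - r - 1))
            for r in range(n_states)]

M = 3                                    # assumed number of states and grid points
rng = np.random.default_rng(1)
a = lowering_operators(M)

# Hypothetical discrete stand-in for the psi_r(x): psi[x, r] is a real
# orthogonal matrix, hence orthonormal and complete on the M grid points.
psi, _ = np.linalg.qr(rng.normal(size=(M, M)))

# Psi(x) = sum over r of a_r psi_r(x), one operator for every grid point x.
Psi = [sum(psi[x, r] * a[r] for r in range(M)) for x in range(M)]

dim = 2 ** M
max_dev = 0.0
for x in range(M):
    for y in range(M):
        # (407): Psi(x) Psi(x') + Psi(x') Psi(x) = 0
        max_dev = max(max_dev, np.abs(Psi[x] @ Psi[y] + Psi[y] @ Psi[x]).max())
        # (407 a) on the grid: the right-hand side is the Kronecker symbol
        delta = np.eye(dim) if x == y else np.zeros((dim, dim))
        max_dev = max(max_dev,
                      np.abs(Psi[x].T @ Psi[y] + Psi[y] @ Psi[x].T - delta).max())

# (405 b): the sum over x of Psi^dagger Psi equals the sum over r of n_r.
lhs = sum(op.T @ op for op in Psi)
rhs = sum(op.T @ op for op in a)
number_dev = np.abs(lhs - rhs).max()
print(max_dev, number_dev)
```

The right-hand side of (407 a) depends on the completeness of the set $\psi_r$, which in the discrete model is guaranteed by the orthogonality of the matrix `psi`.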


The functions $\psi_r(x)$ which serve to define the operator $\Psi(x)$ have been
left hitherto entirely arbitrary apart from the condition of being
mutually orthogonal and normalized. The actual problem, which was
put at the beginning of this section, was to find the coefficients $C_n$,
which determine the wave function $\Omega = \sum_n C_n\chi_n$ describing the behaviour
of the system of electrons under consideration in the configuration
space. From this point of view the functions $\psi_r(x)$ play only an auxiliary
role.

But on the other hand, it is clear that the preceding theory can give
results of real practical value only in the case when the separate
antisymmetrical functions $\chi_n$ form a good approximation to the functions
$\Omega_W$, which describe the stationary states of the system when the external
field does not depend upon the time, or specific types of motion in
a given variable external field. Assuming the latter to be constant, we
are thus led to the problem of determining the individual wave
functions $\psi_r(x)$ in such a way as to make the functions $\chi_n$ the best possible
approximations to the exact antisymmetrical wave functions describing
the stationary states of the system.

Let us consider a stationary state of the system as determined by
the exact equation

    $KC = WC$    (408)

to which the general equation

    $-\frac{h}{2\pi i}\frac{dC}{dt} = KC$

is reduced if the operator $-\frac{h}{2\pi i}\frac{d}{dt}$ is replaced by the characteristic
value of the energy W, i.e. if all the values of the function
$C(n_1, n_2, \ldots)$ for which $\sum_r n_r = N$ are assumed to be proportional to
$e^{-i2\pi Wt/h}$ just as in the case of an ordinary Schrödinger equation. We
then get, denoting the amplitude of $C_n$ by $C^0_{nW}$, the following exact
representation of a stationary state (in the configuration space):

    $\Omega_W = \sum_n C^0_{nW}\,\chi_n(X)$,    (408 a)

where X denotes the totality of the coordinates of all the electrons.
Now the equation (408) must obviously be equivalent to the variational
equation $\delta W = \delta\int \Omega_W^* H \Omega_W\,dX = 0$ with $\Omega_W$ written in the form
(408 a), in conjunction with the condition $\sum_n C^{0*}_{nW}C^0_{nW} = 1$. This
variational equation can serve for the determination of the coefficients
$C^0_{nW}$ if the functions $\chi_n$, i.e. the individual functions $\psi_r$, are known. Or
it can be used for the determination of the latter if the coefficients
$C^0_{nW}$ are known. Assuming them for the moment to vanish for all the
subscripts n except one, we get back to the self-consistent field
considered in the preceding section.

The question we were discussing above is thus reduced to the
following one: Is it possible to determine both the functions $\psi_r$ and the
coefficients $C_n$ from the same variational equation $\delta W = 0$, where
$W = \int \Omega_W^* H \Omega_W\,dX$? Such a determination is certainly possible for
a function (408 a) containing a finite number of terms; if only one of
them is different from zero we get back to the problem of the
self-consistent field already solved.

However, the solution thus obtained will contain, as in the simple case
just mentioned, a certain amount of arbitrariness in the form of the
functions $\psi_r(x)$ (the latter being replaceable by any other set derived from
them by a linear orthogonal transformation). This arbitrariness will
increase with the number of non-vanishing terms and will become
infinite in the limiting case of an infinite series (408 a). So long,
however, as we are looking not for a formal but for a practical solution of
our problem, we can deal with it as if the number of terms in (408 a)
were finite; this procedure will ensure the most rapid convergence of
the series obtained in the limiting case. Dropping the affixes W and 0
in (408 a), we have

    $W = \int \Omega^* H \Omega\,dX = \sum_n\sum_{n'} C_n^* C_{n'}\int \chi_n^* H \chi_{n'}\,dX$,

that is,

    $W = \sum_n\sum_{n'} C_n^* H_{nn'} C_{n'}$,

which according to our previous results can be rewritten in the form

    $W = \sum_n C_n^* K C_n$.    (409)

The problem we have considered hitherto was equivalent to the
variation of the coefficients $C_n$, the operator K being fixed. It could thus
be expressed by the equation

    $\sum_n \delta C_n^*\,K C_n + \sum_n C_n^*\,K\,\delta C_n = 0$

along with

    $\sum_n \delta C_n^*\,C_n + \sum_n C_n^*\,\delta C_n = 0$,

which brings us back to the equation (408). The next step which we
must undertake in order to secure the best possible approximation
consists in the variation of the functions $\psi_r(x)$, i.e. of the operator K which
they define. We thus get the additional equation

    $\sum_n C_n^*\,\delta K\,C_n = 0$.    (409 a)

With the help of the expressions (400) and (406 b) this is easily reduced
to the following form:

    $\sum_n C_n^*\Big(\delta\int \Psi^\dagger(x)[E + e\varphi(x)]\Psi(x)\,dx\Big)C_n = 0$,

provided we consider the variations of the operators $\Psi(x)$ and $\Psi^\dagger(x)$ as
independent of each other, apart from being subject to the condition
$\int \delta\Psi^\dagger(x)\,\Psi(x)\,dx = 0$ [and $\int \Psi^\dagger(x)\,\delta\Psi(x)\,dx = 0$]. Let us consider C
and $C^\dagger$ as a one-column or a one-row matrix and introduce the functions

    $\omega = \Psi C$,   $\omega^\dagger = C^\dagger\Psi^\dagger$.    (409 b)

The preceding equations can then be rewritten as follows:

    $\int \delta\omega^\dagger(x)[E + e\varphi(x)]\omega(x)\,dx = 0$

and

    $\int \delta\omega^\dagger(x)\,\omega(x)\,dx = 0$,

where $\delta\omega^\dagger = C^\dagger\delta\Psi^\dagger$.

Applying Lagrange's method of undetermined multipliers we thus
obtain the equation

    $[E + e\varphi(x) - W]\omega(x) = 0$

or

    $[E + e\varphi(x) - W]\Psi(x)C = 0$,    (410)

where $\varphi(x)$ is defined by (406 a) and W denotes a constant, the
characteristic value of the energy we are looking for. This equation can serve
for the determination of the functions $\psi_r$ so long as the coefficients $C_n$
are supposed to be known for any choice of these functions. The
equation (410) is a good illustration of the method of double quantization,
as it is operational in the double sense of $\Psi(x)$ operating on the matrix
C and $E + e\varphi(x) - W$ operating on $\Psi(x)$.

Assuming the first of these operations to be understood, we can
rewrite the preceding equation in the form

    $[E + e\varphi(x) - W]\Psi(x) = 0$    (410 a)

as for an ordinary wave function, with the only difference that the
additional potential energy $e\varphi(x)$ is itself dependent upon the function
$\Psi$. It should be mentioned that this dependence can be expressed by
the differential equation

    $\nabla^2\varphi = -4\pi e\,\Psi^\dagger(x)\Psi(x)$,    (410 b)

which is equivalent to the integral expression (406 a) for the operator $\varphi$.
This circumstance can be used, as will be shown later on, for a very
important generalization of the theory in the sense of taking into
account the exact electrodynamical laws which govern the interaction
of the electrons.

47. Intensity Quantization of Particles described in the
Configuration Space by a Symmetrical Wave Function
(Einstein-Bose Statistics)

The reduction of the problem of a system of identical particles to that 
of a single particle—corresponding to the method of copies (Part I, § 20) 
—has been considered hitherto for the case of electrons—or more 
generally such particles as in the method of the configuration space 
are described by antisymmetrical wave functions. We are now going 
to consider the same question for the case of particles which belong 
to the symmetrical type, and conform accordingly to the statistics of 
Einstein-Bose (for instance, α-particles, hydrogen atoms considered as
elementary particles, etc.).

Let us start as before with a set of $N$ different individual states
specified by the mutually orthogonal and normalized functions $\psi_1(x)$,
$\psi_2(x)$, ..., $\psi_N(x)$. Introducing the factorized wave function

$$\phi = \psi_1(x_1)\,\psi_2(x_2)\cdots\psi_N(x_N),$$

we can define the symmetrical wave function describing the whole
system by the formula

$$\chi = \frac{1}{\sqrt{N!}}\sum_P P\phi, \tag{411}$$

which differs from the corresponding antisymmetrical wave function
by the absence of the sign-factors $\epsilon_P$. This is of course a slight
simplification so long as we are considering a set of $N$ different individual
states. We get in this case from the variational equation

$$\delta\int \chi^* H\chi\,dX = 0$$

in conjunction with the conditions

$$\int \psi_i^*(x)\,\psi_k(x)\,dx = \delta_{ik}$$

the following system of equations for the functions $\psi_i$ [corresponding
to the method of the self-consistent field, cf. (366)],

$$(E+B-A_{ii})\,\psi_i(x) + \sum_{k\neq i}(A_{ki}+\lambda_{ki})\,\psi_k(x) = 0, \tag{411 a}$$

the functions $B(x)$ and $A_{ki}(x)$ being defined by the same formulae,
(365), (365 a), as before. The energy (or its probable value) is expressed
accordingly by the formula

$$\sqrt{N!}\int \phi^* H\chi\,dX = \sum_P\int \phi^* H P\phi\,dX,$$

which gives
that is,

$$W = \sum_i\Big(E_{ii} - \int A_{ii}(x)\,\psi_i^*(x)\psi_i(x)\,dx\Big) + \tfrac12\iint F(x,x')\,[\rho(x,x)\rho(x',x')+|\rho(x,x')|^2]\,dx\,dx'$$

or

$$W = \sum_i\Big[E_{ii} - \iint F(x,x')\,|\psi_i(x)|^2\,|\psi_i(x')|^2\,dx\,dx'\Big] + \tfrac12\iint F(x,x')\,[\rho(x,x)\rho(x',x')+|\rho(x,x')|^2]\,dx\,dx'. \tag{411 b}$$

We thus see that in the case of symmetrical wave functions the density
matrix $\rho(x,x')$ cannot replace the separate wave functions. Using the
notation (395 a) for the matrix elements of $F$ and affixing the index
$n$ to $\chi$, we can rewrite the preceding expression in the form

$$H_{nn} = \int \chi_n^* H\chi_n\,dX = \sum_r (E_{rr}-F_{rr;rr}) + \tfrac12\sum_r\sum_s (F_{rs;rs}+F_{rs;sr}) \tag{412}$$

similar to the expression (396) which corresponds to the antisymmetrical
case with a similar condition as to the arrangement of the indices
$r_1, r_2, \ldots, r_N$ in $\phi = \psi_{r_1}(x_1)\,\psi_{r_2}(x_2)\cdots\psi_{r_N}(x_N)$.
If $\chi_n$ is replaced by a function $\chi_{n'}$ differing from $\chi_n$ in that a single
individual function $\psi_p$ is replaced by $\psi_{p'}$, we get likewise

$$H_{nn'} = E_{pp'} + \sum_{r\neq p}(F_{pr;p'r}+F_{pr;rp'}). \tag{412 a}$$

If finally two of the functions serving to construct $\chi_n$, $\psi_p$ and $\psi_q$ say,
are replaced by two new ones different from each other and from all
the other functions in the original set, we get

$$H_{nn'} = F_{pq;p'q'}+F_{pq;q'p'} \qquad (p<q). \tag{412 b}$$

Let us now pass to the general case, which has no parallel in the theory
of antisymmetrical functions $\chi$, where certain individual states are
multiply occupied, so that, for instance, each function $\psi_r(x)$ occurs $n_r$
times ($n_r > 0$) in the set specified by the index $n$, where the sum
$\sum_r n_r$ must of course be equal to the number of particles.

The formula (411) will still be valid in this case, except for the
normalization factor $1/\sqrt{N!}$, which must be replaced by
$\sqrt{n_1!\,n_2!\cdots/N!}$ if only
effectively different permutations $P$ are included in the sum (411), i.e.
such permutations as interchange particles associated with different
individual states. Thus, leaving aside trivial permutations, we can
define the normalized symmetrical wave functions by the formula

$$\chi_n = \frac{1}{\sqrt{g_n}}\sum_P P\phi_n, \tag{413}$$

where

$$g_n = \frac{N!}{n_1!\,n_2!\cdots}. \tag{413 a}$$
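The counting of effectively different permutations can be checked by brute force. The sketch below is an illustration only, assuming that $g_n$ stands for the multinomial number $N!/(n_1!\,n_2!\cdots)$; the helper names `distinct_arrangements` and `g_formula` and the occupation numbers are hypothetical and do not come from the text.

```python
from itertools import permutations
from math import factorial, prod

def distinct_arrangements(occupations):
    """Count effectively different permutations by direct enumeration."""
    # Label every particle by the index r of the individual state it occupies.
    labels = [r for r, n_r in enumerate(occupations) for _ in range(n_r)]
    # Permutations that only interchange equal labels collapse in the set.
    return len(set(permutations(labels)))

def g_formula(occupations):
    """Assumed closed form g_n = N! / (n_1! n_2! ...)."""
    total = sum(occupations)
    return factorial(total) // prod(factorial(n_r) for n_r in occupations)

occ = (2, 1, 3)  # n_1 = 2, n_2 = 1, n_3 = 3, so that N = 6 (arbitrary choice)
print(distinct_arrangements(occ), g_formula(occ))
```

For singly occupied states the count reduces to $N!$, in agreement with the normalization factor of (411).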


We then get, instead of (412), 

^«b ~ Z n r{^rr~~Kr\rr)-\" 1L 2 n r n t,(Ks;i'S J r Ks;*r)‘ (414) 

r r< s 

The calculation of the matrix elements corresponding to (412 a) requires
a little more care.

We must in the first place determine the number of times a given
variable, $x_r$ say, will be met in the sum $\sum_P P\phi_n$ associated with a certain
function $\psi_p$. The number is obviously equal to

$$\frac{(N-1)!}{(n_p-1)!\,\prod_{s\neq p} n_s!} = \frac{n_p\,g_n}{N}.$$

The function $\phi_{n'}$ will differ from $\phi_n$ by the fact that it will contain
$n'_p = n_p-1$ factors $\psi_p$ and $n'_{p'} = n_{p'}+1$ factors $\psi_{p'}$. Now the matrix
element of $E_r$ with respect to the functions $P\phi_n$ and $P'\phi_{n'}$ will be
different from zero only if the variable $x_r$ is shifted from $\psi_p$ to $\psi_{p'}$, all
the rest remaining in their places. There will thus be $n_p g_n/N$ terms
equal to $(E_r)_{pp'}$. Now since $H$ contains the sum of $N$ terms $E_r$,
corresponding to the proper energy of all the $N$ particles, the matrix element
$E_{pp'}$ will appear in the expression $\sum_P\sum_{P'}\int P\phi_n^*\,H\,P'\phi_{n'}\,dX$ just $n_p g_n$ times.
The coefficient of $E_{pp'}$ in $H_{nn'}$ will thus be

$$\frac{n_p\,g_n}{\sqrt{g_n\,g_{n'}}} = \sqrt{n_p}\,\sqrt{n_{p'}+1}.$$

The same argument applies to the second term in (412 a) which
corresponds to the mutual energy of the particles and must besides be
multiplied by $n_r$. We get accordingly, instead of (412 a),

$$H_{nn'} = \sqrt{n_p}\,\sqrt{n_{p'}+1}\,\Big[E_{pp'} + \sum_{r\neq p} n_r\,(F_{pr;p'r}+F_{pr;rp'})\Big]. \tag{414 a}$$

By a similar argument we obtain, instead of (412 b),

$$H_{nn'} = \sqrt{n_p}\,\sqrt{n_q}\,\sqrt{n_{p'}+1}\,\sqrt{n_{q'}+1}\,(F_{pq;p'q'}+F_{pq;q'p'}). \tag{414 b}$$
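The weight $\sqrt{n_p}\sqrt{n_{p'}+1}$ standing in front of $E_{pp'}$ can be verified numerically from the permutation counts. The following sketch assumes that the number of effectively different permutations is $g_n = N!/(n_1!\,n_2!\cdots)$ and that the weight arises as $n_p g_n/\sqrt{g_n g_{n'}}$; the function names and the sample occupation numbers are hypothetical.

```python
from math import factorial, isclose, prod, sqrt

def g(occ):
    """Number of effectively different permutations, N!/(n_1! n_2! ...)."""
    return factorial(sum(occ)) // prod(factorial(k) for k in occ)

def moved(occ, p, p_prime):
    """Occupation numbers after one particle passes from state p to state p'."""
    new = list(occ)
    new[p] -= 1
    new[p_prime] += 1
    return tuple(new)

def coefficient(occ, p, p_prime):
    """n_p g_n / sqrt(g_n g_n'), the assumed weight multiplying E_pp'."""
    return occ[p] * g(occ) / sqrt(g(occ) * g(moved(occ, p, p_prime)))

occ = (3, 2, 0, 1)  # arbitrary example occupations
checks = [
    isclose(coefficient(occ, p, q), sqrt(occ[p] * (occ[q] + 1)))
    for p in range(len(occ)) for q in range(len(occ))
    if p != q and occ[p] > 0
]
print(all(checks))
```

The loop runs over every admissible pair of an occupied initial state and a different final state, so the identity is tested for empty as well as multiply occupied final states.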

If we substitute these expressions in the equations

$$-\frac{h}{2\pi i}\frac{dC_n}{dt} = \sum_{n'} H_{nn'}\,C_{n'},$$

which determine the coefficients of the expansion $\Omega = \sum_n C_n\chi_n$, and
introduce the operators $\alpha_r$ already used, we can bring them to the
standard form

$$-\frac{h}{2\pi i}\frac{dC}{dt} = KC$$

with the following expression for the operator $K$:

K = 2n r E„+ XZ 'ln lt (\ln p -\-\)E I>p .«l<x p — 

r p*p' 

— 2 n r^rr;rr+ 2 2 n r n 8 (^rs;r8~^^ra;sr) + 
r r<s 

+ 22 ^ n p\!( ri p fJ r^) 2 n r(Fpr; 2 /r^Fpr;rp'Wp <V + 

p¥-jr r>p 

+ 2 2 2 2 ^ w p V» a V(V+ 1 )V( W <f+ l )( F pr,pV+ F PWP , )°i <4<V <V> 

p*p’i(r*q' 

P<Q 


K = 2 -#rr»r+ 2 2 E pp’ ^»p «J> “p+ W P' + 2 2 8f» )(•*!■«»+) + 

r r&Ss 

+ 222 n r ( F pr,p-r+ F pr^n p “p V +V + 

r>p p¥-2>' 

+ 2?22 ^pq;p’q’ + ^pq^qp^^p ^ n q a q ' a j>' 

p<q\p*p;q'*a 


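This result can further be simplified if we replace the numbers $n_r$ by
operators, represented by the diagonal matrices

$$n_r = \begin{pmatrix} 0&0&0&0&\cdots\\ 0&1&0&0&\cdots\\ 0&0&2&0&\cdots\\ 0&0&0&3&\cdots\\ \vdots&\vdots&\vdots&\vdots&\ddots \end{pmatrix},$$

and represent accordingly the operators $\alpha_r$ and $\alpha_r^\dagger$ (from the point of
view of $n_r$) by the matrices

$$\alpha_r = \begin{pmatrix} 0&1&0&0&\cdots\\ 0&0&1&0&\cdots\\ 0&0&0&1&\cdots\\ 0&0&0&0&\cdots\\ \vdots&\vdots&\vdots&\vdots&\ddots \end{pmatrix}, \qquad \alpha_r^\dagger = \begin{pmatrix} 0&0&0&0&\cdots\\ 1&0&0&0&\cdots\\ 0&1&0&0&\cdots\\ 0&0&1&0&\cdots\\ \vdots&\vdots&\vdots&\vdots&\ddots \end{pmatrix}, \tag{415 a}$$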


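which are a generalization of the matrices used to represent the
operators $n_r$, $\alpha_r$, and $\alpha_r^\dagger$ in the antisymmetrical case. We can then put, since

$$\alpha_r^\dagger\alpha_r = \begin{pmatrix} 0&0&0&\cdots\\ 0&1&0&\cdots\\ 0&0&1&\cdots\\ \vdots&\vdots&\vdots&\ddots \end{pmatrix},$$

$$n_r = \sqrt{n_r}\,\alpha_r^\dagger\alpha_r\,\sqrt{n_r},$$

where

$$\sqrt{n_r} = \begin{pmatrix} 0&0&0&0&\cdots\\ 0&\sqrt1&0&0&\cdots\\ 0&0&\sqrt2&0&\cdots\\ 0&0&0&\sqrt3&\cdots\\ \vdots&\vdots&\vdots&\vdots&\ddots \end{pmatrix}, \tag{415 b}$$

and combine the first two terms of $K$ into a single one,

$$\sum_r\sum_{r'} E_{rr'}\,\sqrt{n_r}\,\alpha_r^\dagger\,\alpha_{r'}\sqrt{n_{r'}},$$

the summation over $r$ and $r'$ being unrestricted (i.e. vanishing terms
being cancelled out automatically).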

The other three terms corresponding to the mutual energy F can 
likewise be combined into a single one, 

2 2 &p' ^p't 

P<Q 

the second term corresponding to the case $q' = q$ and the first to the
case $q' = q$; $p' = p$. It should be noticed that the term proportional to
$n_r F_{rr;rr}$ is subtracted automatically in virtue of the relation
$n_r\alpha_r = \alpha_r(n_r-1)$, which reduces the product

$$\sqrt{n_r}\,\alpha_r^\dagger\,\sqrt{n_r}\,\alpha_r^\dagger\,\alpha_r\sqrt{n_r}\,\alpha_r\sqrt{n_r} = \sqrt{n_r}\,\alpha_r^\dagger\,n_r\,\alpha_r\sqrt{n_r}$$

to $\sqrt{n_r}\,\alpha_r^\dagger\alpha_r\,(n_r-1)\sqrt{n_r} = n_r(n_r-1)$.

It now remains to introduce, instead of the operators $n_r$, $\alpha_r$, and $\alpha_r^\dagger$,
combined operators which can be defined by the formulae

$$b_r = \alpha_r\sqrt{n_r}, \qquad b_r^\dagger = \sqrt{n_r}\,\alpha_r^\dagger \tag{416}$$

or by the relations

$$b_r^\dagger b_r = n_r, \qquad b_r b_r^\dagger = n_r+1 \tag{416 a}$$

following therefrom, and which will be subject to the commutation
conditions

$$b_r b_s - b_s b_r = 0, \qquad b_r^\dagger b_s^\dagger - b_s^\dagger b_r^\dagger = 0, \tag{417}$$

$$b_r b_s^\dagger - b_s^\dagger b_r = \delta_{rs} \tag{417 a}$$

(the latter being in agreement with (416 a) in the case $r = s$). With
the help of these operators, which are quite similar to the operators
$a_r$, $a_r^\dagger$ [differing from the latter by the sign only in the commutation
relations (417 a, b)], the operator $K$ can be written in the form

$$K = \sum_r\sum_{r'} E_{rr'}\,b_r^\dagger b_{r'} + \tfrac12\sum_r\sum_s\sum_{r'}\sum_{s'} F_{rs;r's'}\,b_r^\dagger b_s^\dagger b_{s'} b_{r'}, \tag{417 b}$$

which can be obtained from (402) by replacing the $a$’s by the $b$’s. It
should be noticed that the order of the two last factors in (417 b) is
irrelevant [while it is very important in (402)] since they commute with
each other.
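The matrix representation of these operators lends itself to a quick numerical check. The sketch below is an illustration under stated assumptions: the infinite matrices are cut off at a finite dimension `D` (a hypothetical truncation, not part of the theory), $\alpha$ is taken with units on the first superdiagonal, and $b = \alpha\sqrt{n}$, $b^\dagger = \sqrt{n}\,\alpha^\dagger$. The truncation spoils the commutation relation in the last row and column only, so that corner is excluded from the comparison.

```python
import numpy as np

D = 6                                    # cutoff: occupation numbers 0..D-1 (assumption)
n = np.diag(np.arange(D, dtype=float))   # diagonal matrix of occupation numbers
alpha = np.eye(D, k=1)                   # units on the first superdiagonal
sqrt_n = np.sqrt(n)

b = alpha @ sqrt_n                       # b  = alpha sqrt(n)
b_dag = sqrt_n @ alpha.T                 # b† = sqrt(n) alpha†

comm = b @ b_dag - b_dag @ b             # should equal 1 apart from the cutoff corner
print(np.allclose(b_dag @ b, n))
print(np.allclose(comm[:-1, :-1], np.eye(D - 1)))
```

The last diagonal element of `comm` equals $-(D-1)$ instead of 1; this is an artefact of the cutoff and moves to infinity as `D` grows.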

The commutation relations (417 a), just like their analogues (403 a)
and (413 b), are actually self-supporting and can be used to define
the operators $n_r$ by one of the expressions (416). The fact that the
characteristic values of these operators are equal to 0, 1, 2,... can be
considered as a consequence of the relations (417) and (417 a). We
need not repeat in detail all that has been said in the preceding section
about the operator

$$N = \sum_r n_r,$$

representing the total number of particles, and the functional operators

$$\Psi(x) = \sum_r b_r\,\psi_r(x), \qquad \Psi^\dagger(x) = \sum_r b_r^\dagger\,\psi_r^*(x). \tag{418}$$

It need only be stated that they satisfy the relations

$$\Psi(x)\Psi(x')-\Psi(x')\Psi(x) = 0, \qquad \Psi^\dagger(x)\Psi^\dagger(x')-\Psi^\dagger(x')\Psi^\dagger(x) = 0, \tag{418 a}$$

$$\Psi(x)\Psi^\dagger(x')-\Psi^\dagger(x')\Psi(x) = \delta(x-x') \tag{418 b}$$
and can serve to express the energy operator in the form

$$K = \int \Psi^\dagger(x)\,E\,\Psi(x)\,dx + \tfrac12\iint \Psi^\dagger(x)\Psi^\dagger(x')\,F(x,x')\,\Psi(x')\Psi(x)\,dx\,dx',$$

the order of the first two or of the last two factors in the double integral
being irrelevant. In the case of a stationary state of the system of
particles the equation of motion reduces to the form

$$KC = WC,$$

which can be derived from the variation principle $\delta\sum_n C_n^* K C_n = 0$ in
conjunction with the condition $\sum_n C_n^* C_n = 1$. The same principle when
applied once more to the operator $K$ itself, i.e. in the form

$$\sum_n C_n^*\,\delta K\,C_n = 0,$$

leads to a double operator equation 

C'\e+ | F(x,x')y'(x')y(x') dx']p(x)C = 0 

for the determination of the functions $\psi_r(x)$.

In the special case when there is no interaction between the
particles ($F = 0$) the transition from the equations

$$-\frac{h}{2\pi i}\frac{dc_r}{dt} = \sum_{r'} E_{rr'}\,c_{r'}, \tag{419}$$

describing the motion of a single particle, specified by the energy
operator $E$ and the wave function $\psi(x) = \sum_r c_r\psi_r(x)$, to the equations

$$-\frac{h}{2\pi i}\frac{\partial\Omega}{\partial t} = H\Omega \tag{419 a}$$

for any number $N$ of such particles ($H = \sum_{i=1}^{N} E_i$) can be carried out,
according to Dirac, in the following way:

The right-hand side of the equation (419) can be defined as the
differential coefficient with respect to $c_r^*$ of the expression

$$\bar E = \sum_r\sum_{r'} c_r^*\,E_{rr'}\,c_{r'}, \tag{420}$$

which represents the probable value of the energy in the state specified
by the wave function $\psi(x) = \sum_r c_r\psi_r(x)$. We thus get

$$-\frac{h}{2\pi i}\frac{dc_r}{dt} = \frac{\partial\bar E}{\partial c_r^*},$$

and in a similar way

$$\frac{h}{2\pi i}\frac{dc_r^*}{dt} = \frac{\partial\bar E}{\partial c_r}.$$
and in a similar way 
If we now put

$$c_r = Q_r, \qquad -\frac{h}{2\pi i}\,c_r^* = P_r, \tag{420 a}$$

or

$$c_r^* = Q_r, \qquad \frac{h}{2\pi i}\,c_r = P_r, \tag{420 b}$$

these equations can be rewritten in the form

$$\frac{dP_r}{dt} = -\frac{\partial\bar E}{\partial Q_r}, \qquad \frac{dQ_r}{dt} = \frac{\partial\bar E}{\partial P_r}, \tag{420 c}$$


i.e. in the standard canonical form of the classical equations of motion.
The variables $Q_r$ and $P_r$ can be identified with the generalized
coordinates and momenta, the Hamiltonian $\bar E$ being a bilinear function
of them both and the number of degrees of freedom being infinite.
Let us now pass over from the equations (420 c) to the corresponding
wave-mechanical equation

$$\bar E\,\omega = -\frac{h}{2\pi i}\frac{\partial\omega}{\partial t}, \tag{421}$$

where $\omega$ is the wave function, or probability amplitude for given
values of the coordinates $Q_r$, the classical momenta $P_r$ being replaced
by the differential operators $\dfrac{h}{2\pi i}\dfrac{\partial}{\partial Q_r}$. Or let us take the equations
(420) directly over into the quantum theory considering the variables
$Q$ and $P$ as operators (matrices) which satisfy the commutation relations

$$Q_rQ_{r'}-Q_{r'}Q_r = P_rP_{r'}-P_{r'}P_r = 0, \qquad P_rQ_{r'}-Q_{r'}P_r = \frac{h}{2\pi i}\,\delta_{rr'}.$$

Replacing here $P_r$ and $Q_r$ by their expressions in terms of $c_r$ and $c_r^*$,
we get

$$c_rc_{r'}-c_{r'}c_r = 0, \qquad c_r^*c_{r'}-c_{r'}c_r^* = -\delta_{rr'}. \tag{421 a}$$

These relations are equivalent to the wave-mechanical relation

$$c_r^* = -\frac{\partial}{\partial c_r}, \tag{421 b}$$

which follows from $P_r = \dfrac{h}{2\pi i}\dfrac{\partial}{\partial Q_r}$. We thus see that the coefficients
$c$ and $c^*$ satisfy exactly those conditions which have been established
above for the operators $b$ and $b^\dagger$, and can accordingly be identified
with the latter.

The application of the quantization process just shown (‘second
quantization’) to the coefficients $c$, $c^*$, i.e. their replacement by the
operators $b$ and $b^\dagger$, thus leads us directly from the equations (419)
(which with their conjugate complex can be considered as a system
of canonical equations in the classical sense) to the ‘wave-mechanical’
equation (421), that is,

$$-\sum_r\sum_{r'} E_{rr'}\,\frac{\partial}{\partial c_r}\,(c_{r'}\,\omega) = -\frac{h}{2\pi i}\frac{\partial\omega}{\partial t},$$

or the equivalent (operator) equation

$$\sum_r\sum_{r'} E_{rr'}\,b_r^\dagger b_{r'}\,\omega = W\omega.$$

This is no other than our previous equation

$$KC = WC$$

with $\omega$ replacing $C$ and with an operator $K$ of the form

$$K = \sum_r\sum_{r'} E_{rr'}\,b_r^\dagger b_{r'},$$

which corresponds to a system of identical particles describable by
symmetrical wave functions in the configuration space without any
interaction.

In other words, the quantization of the equations (419), describing
the motion of a single particle, leads us to an equation describing the
motion of any number of such particles, provided they conform to
the statistics of Einstein and Bose. The actual number of these particles
is equal to one of the characteristic values of the operator

$$N = \sum_r b_r^\dagger b_r$$

and remains a constant of the motion since $N$ commutes with $K$. The
motion of the whole assembly of particles is described by the
operator-function

$$\Psi(x) = \sum_r b_r\,\psi_r(x)$$

with the help of which the energy operator can be written in the form

$$K = \int \Psi^\dagger(x)\,E\,\Psi(x)\,dx.$$
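The statement that the number of particles is a constant of the motion can be illustrated numerically for the interaction-free operator $K = \sum\sum E_{rr'}\,b_r^\dagger b_{r'}$. The sketch below is a toy model only: it assumes two individual states, a cutoff `D` on each occupation number, and an arbitrary Hermitian matrix `E`; none of these values comes from the text.

```python
import numpy as np

D = 5                                          # each state holds 0..D-1 particles (assumed cutoff)
b1 = np.diag(np.sqrt(np.arange(1.0, D)), k=1)  # single-state matrix with elements sqrt(n)
I = np.eye(D)
b = [np.kron(b1, I), np.kron(I, b1)]           # operators b_1, b_2 on the product space

E = np.array([[1.0, 0.3],
              [0.3, 2.0]])                     # hypothetical matrix E_rr'

# K = sum_{r,r'} E_rr' b_r† b_r'   and   N = sum_r b_r† b_r  (the matrices are real)
K = sum(E[r, s] * b[r].T @ b[s] for r in range(2) for s in range(2))
N = sum(b[r].T @ b[r] for r in range(2))

print(np.allclose(N @ K, K @ N))
```

Each term $b_r^\dagger b_{r'}$ takes one particle out of the state $r'$ and puts one into the state $r$, so the total number is unchanged term by term; this is why the commutator vanishes exactly even with the cutoff.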


An exactly similar scheme can be applied, according to Jordan and
Klein, in the general case of a system of identical particles of the
‘symmetrical type’ interacting with each other, if this interaction is
represented by a ‘quasi-external’ potential energy of the form

$$V(x) = \int \psi^*(x')\,F(x,x')\,\psi(x')\,dx',$$

the operator of the proper energy being replaced accordingly by
$E+\tfrac12 V$. We then get, putting $\psi(x) = \sum_r c_r\psi_r(x)$,

$$V(x) = \sum_s\sum_{s'} F_{ss'}(x)\,c_s^*\,c_{s'},$$

and consequently

$$K = \int \psi^*(x)\,(E+\tfrac12 V)\,\psi(x)\,dx = \int \psi^*(x)\,E\,\psi(x)\,dx + \tfrac12\iint \psi^*(x)\psi^*(x')\,F(x,x')\,\psi(x)\psi(x')\,dx\,dx',$$

or

$$K = \sum_r\sum_{r'} E_{rr'}\,c_r^*c_{r'} + \tfrac12\sum_r\sum_s\sum_{r'}\sum_{s'} F_{rs;r's'}\,c_r^*c_s^*c_{s'}c_{r'}$$

with

$$F_{rs;r's'} = (F_{ss'})_{rr'} = \iint \psi_r^*(x)\psi_s^*(x')\,F(x,x')\,\psi_{r'}(x)\psi_{s'}(x')\,dx\,dx'.$$

It now remains to replace the numerical coefficients $c$ and $c^*$ by the
operators $b$ and $b^\dagger$ in order to obtain the energy operator $K$
corresponding to the problem of many particles.

It should be mentioned that the ‘quasi-classical equations’ for the
coefficients $c$ can be written in the general case in the same canonical
form (420 b) as in the special case of no interaction, and that the
transition to the quantum (or doubly quantum) equations can be effected as
before by treating the coefficients $c_r^*$ as the operators $-\partial/\partial c_r$ (or $c_r$ as
$\partial/\partial c_r^*$).

The preceding scheme for carrying out the process of second (intensity)
quantization could be applied in principle to the case of particles of the
antisymmetrical type just as well as to particles of a symmetrical type,
namely, by substituting the operators $a$ instead of the $b$’s for the
coefficients $c$. It would, however, be impossible in this case to consider the
conjugate complex coefficients $c^*$ as differential operators $-\partial/\partial c$ and to
repeat with regard to the quasi-classical equations for the $c$’s and $c^*$’s
the same process which leads from the classical equations of the motion
of a particle to the wave-mechanical equation.

The operators $b_r$ and $b_r^\dagger$ are written by Dirac in the form

$$b_r = e^{i2\pi\theta_r/h}\,\sqrt{n_r}, \qquad b_r^\dagger = \sqrt{n_r}\,e^{-i2\pi\theta_r/h}, \tag{422}$$

corresponding to the usual expressions for the coefficients $c_r$, $\sqrt{n_r}$
playing the role of the modulus and $2\pi\theta_r/h$ that of the argument or
phase angle. It follows from a comparison of (422) with (416) that the
operators $e^{i2\pi\theta_r/h}$ and $e^{-i2\pi\theta_r/h}$ are no other than the operators $\alpha_r$ and
$\alpha_r^\dagger$ considered before. Hence it follows that the operators $\theta_r$ can be
represented from the point of view of the operators $n_r$ by the formula

$$\theta_r = \frac{h}{2\pi i}\frac{\partial}{\partial n_r}.$$

We have in fact, applying the operator

$$\alpha_r = e^{i2\pi\theta_r/h} = \sum_{k=0}^{\infty}\frac{1}{k!}\Big(\frac{\partial}{\partial n_r}\Big)^k \tag{422 a}$$

to any function of $n_r$,

$$\alpha_r f(n_r) = \sum_{k=0}^{\infty}\frac{1}{k!}\frac{\partial^k f}{\partial n_r^k} = f(n_r+1)$$

by Taylor’s theorem, and in a similar way

$$e^{-i2\pi\theta_r/h} f(n_r) = \sum_{k=0}^{\infty}\frac{(-1)^k}{k!}\frac{\partial^k f}{\partial n_r^k} = f(n_r-1).$$
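The action of the exponential of a derivative as a shift operator is easy to verify on a polynomial, for which the Taylor series terminates. The sketch below is an illustration only; the helper name `shift` and the sample polynomial are hypothetical, and the factor $i2\pi\theta/h$ is assumed to reduce to $\partial/\partial n$ as stated above.

```python
from math import factorial

import numpy as np
from numpy.polynomial import Polynomial

def shift(f, sign=+1):
    """Apply exp(sign * d/dn) = sum_k sign^k / k! * (d/dn)^k to a polynomial f."""
    result = Polynomial([0.0])
    for k in range(f.degree() + 1):        # higher derivatives of f vanish identically
        term = f.deriv(k) if k > 0 else f
        result = result + (sign ** k / factorial(k)) * term
    return result

f = Polynomial([1.0, -2.0, 0.0, 4.0])      # f(n) = 1 - 2n + 4n^3, an arbitrary example
n = np.arange(0, 5)

print(np.allclose(shift(f, +1)(n), f(n + 1)))   # exp(+d/dn) f(n) = f(n + 1)
print(np.allclose(shift(f, -1)(n), f(n - 1)))   # exp(-d/dn) f(n) = f(n - 1)
```

For functions that are not polynomials the series does not terminate, and the same check would require a truncation of the sum.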

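If instead of considering $\theta_r$ and $b_r$ from the point of view of $n_r$ we
consider $b_r$ and $n_r$ from the point of view of $b_r^\dagger$, we get, as has been
shown before,

$$b_r = \frac{\partial}{\partial b_r^\dagger} \tag{422 b}$$

and consequently $n_r = b_r^\dagger\,\dfrac{\partial}{\partial b_r^\dagger}$. Replacing $b_r^\dagger$ by $b_r$ as the basic quantity,
we get likewise

$$b_r^\dagger = -\frac{\partial}{\partial b_r} \qquad\text{and}\qquad n_r = -\frac{\partial}{\partial b_r}\,b_r. \tag{422 c}$$

Representations of a similar type are not possible in the case of the
operators $a$ and $a^\dagger$.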

Just as in the latter case, the operators $b$ and $b^\dagger$, which are not
Hermitian, can, however, be reduced to Hermitian operators $p$ and $q$ by
means of the relations

$$b = \tfrac12(q+ip), \qquad b^\dagger = \tfrac12(q-ip), \tag{423}$$

which correspond to the relations (404).

The operators $p$ and $q$ are represented by the matrices

$$q = \begin{pmatrix} 0&\sqrt1&0&0&\cdots\\ \sqrt1&0&\sqrt2&0&\cdots\\ 0&\sqrt2&0&\sqrt3&\cdots\\ 0&0&\sqrt3&0&\cdots\\ \vdots&\vdots&\vdots&\vdots&\ddots \end{pmatrix}, \qquad p = \begin{pmatrix} 0&-i\sqrt1&0&0&\cdots\\ i\sqrt1&0&-i\sqrt2&0&\cdots\\ 0&i\sqrt2&0&-i\sqrt3&\cdots\\ 0&0&i\sqrt3&0&\cdots\\ \vdots&\vdots&\vdots&\vdots&\ddots \end{pmatrix},$$

which follow at once from (415 a), (415 b), and (416), and are easily
seen to coincide with the matrices representing the coordinate and the
momentum of a linear harmonic oscillator (cf. Chap. III, § 13). Their
non-vanishing matrix elements can indeed be written in the form

$$q_{n,n-1} = q_{n-1,n} = \sqrt{n}, \qquad p_{n,n-1} = -p_{n-1,n} = i\sqrt{n} \tag{423 a}$$

(where the index $r$ has been dropped), and differ by certain
proportionality coefficients only from the expressions (88 a) derived in § 13.
From (423) we get

$$b^\dagger b = n = \tfrac14\,[p^2+q^2-i(pq-qp)],$$

$$b\,b^\dagger = n+1 = \tfrac14\,[p^2+q^2+i(pq-qp)],$$

whence

$$pq-qp = \frac{2}{i}. \tag{423 b}$$

This reduces to the usual relation $PQ-QP = h/2\pi i$ between the
momentum $P$ and the coordinate $Q$ if they are defined as $\sqrt{h\omega/4\pi}\;p$
and $\sqrt{h/4\pi\omega}\;q$ respectively. With the help of the preceding relations
we find the following expression for $n$:

$$n = \tfrac14\,(p^2+q^2-2), \tag{423 c}$$

which can be rewritten in the form

$$\tfrac12\,(P^2+\omega^2Q^2) = \big(n+\tfrac12\big)\frac{h\omega}{2\pi},$$

corresponding to the quantized values of the energy of a harmonic
oscillator with the frequency $\nu = \omega/2\pi$, $n$ playing the role of the
quantum number.
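The relations between $p$, $q$ and $n$ can be confirmed on truncated matrices. The following sketch assumes $q = b+b^\dagger$ and $p = -i(b-b^\dagger)$, which is the inverse of (423), and cuts the matrices off at a hypothetical dimension `D`; because of the cutoff the last row and column are left out of the comparison.

```python
import numpy as np

D = 8                                                  # cutoff dimension (assumption)
b = np.diag(np.sqrt(np.arange(1.0, D)), k=1).astype(complex)
b_dag = b.conj().T

q = b + b_dag                                          # from b = (q + ip)/2, b† = (q - ip)/2
p = -1j * (b - b_dag)

comm = p @ q - q @ p                                   # expected: 2/i times the unit matrix
n_from_pq = 0.25 * (p @ p + q @ q - 2 * np.eye(D))     # expected: diag(0, 1, 2, ...)

print(np.allclose(comm[:-1, :-1], (2 / 1j) * np.eye(D - 1)))
print(np.allclose(n_from_pq[:-1, :-1], np.diag(np.arange(D - 1))))
```

Both matrices `p` and `q` come out Hermitian, as they must if they are to represent the momentum and the coordinate of an oscillator.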

These results bring us back to the elementary theory of the quantized
waves which has been sketched in Part I, § 20, with the trivial
difference that we do not have to worry about the half-integral energy
values of the harmonic oscillators representing the different states,
since it is not their energy, but the quantum number $n$ which gives
the number of particles associated with the corresponding state. It is
of more importance that we have now obtained an exact and general
expression for the energy $K$ of the system of particles in terms of the
auxiliary variables $b_r$, $b_r^\dagger$, whereas it was assumed before without
sufficient justification that this energy is simply equal to the sum
$\sum_{r=1}^{\infty} E_r n_r$. In reality it reduces to this expression in the special case only
of no interaction and for a special (though of course most natural)
choice of the wave functions $\psi_r$ as corresponding to the stationary
states, specified by the energy operator $E$ ($E_r = E_{rr}$). In this case the
energy $K$ can be expressed as a simple function of the Hermitian
variables $p$ and $q$, namely,

$$K = \tfrac14\sum_r E_r\,(p_r^2+q_r^2-2).$$

Their introduction in the general case instead of the variables $b_r$ and
$b_r^\dagger$ would, however, lead only to a useless complication of the theory.

It is interesting to find the harmonic oscillator variables replaced in the
case of electrons (or any other particles described by antisymmetrical
functions) by the spin variables $\sigma_x$, $\sigma_y$, a fact which could hardly be
anticipated in the early development of the theory of quantized waves,
given in Part I.


48. Interaction between a ‘Doubly Quantized’ System and an Ordinary System: Application to Photons

We have considered hitherto a system of identical particles, with or
without interaction, in a given external field of force (specified by the
potential energy $U_0(x)$ or the operator $E$). We shall consider now the
more general case of such a system in interaction with some system
of a different kind which will be described to begin with in the usual
way, i.e. by giving the coordinates of all the particles constituting it.
The energy $H$ of the combined system $A+B$ will consist of three parts:
the energy of $A$ taken alone ($H_A$), that of $B$ taken alone ($H_B$), and
their mutual energy $M = H_{AB}$, which can be considered as a
perturbation.

The method of the ‘intensity quantization’ discussed in the two
preceding sections can easily be extended to the present case if the wave
function $\Omega$ describing the whole system in the method of configuration
space is written in the form

$$\Omega(X,Y) = \sum_n \omega_n(Y,t)\,\chi_n(X), \tag{424}$$


where $X$ and $Y$ denote the totality of coordinates specifying the
corresponding system, while $\chi_n$ denotes a symmetrical or antisymmetrical
function of the coordinates $x_1, x_2,\ldots$ of the particles constituting $X$,
according to the nature of these particles. Substituting (424) in the
wave equation

$$-\frac{h}{2\pi i}\frac{\partial\Omega}{\partial t} = H\Omega,$$

we obtain a system of equations

$$-\frac{h}{2\pi i}\frac{\partial\omega_n}{\partial t} = \sum_{n'} H_{nn'}\,\omega_{n'}$$

of the same kind as before, with the only difference that $H_{nn'}$ must
now be treated as an operator with regard to coordinates $Y$, and the
‘coefficients’ $\omega_n$ as functions of these coordinates and of the time.

Introducing the individual states $\psi_r(x)$ ($r = 1, 2,\ldots$) serving to define
the functions $\chi_n$, we shall thus obtain an equation of exactly the same
sort as before for the coefficients $\omega$ considered as functions of the
partition numbers $n_1, n_2,\ldots, n_r,\ldots$, the coordinates $Y$, and of the time,
with the energy operator $\sum_i E_i$ increased by $H_B$ and by the interaction
energy $M$ of $A$ and $B$. Putting

$$M = \sum_i V(x_i, Y),$$

where, in view of the identity of all the particles of $A$, $V$ is the same
function for all of them, we must simply add to the energy operator
$E$ of an individual particle the function $V(x, Y)$. We thus get the
following equation:

$$-\frac{h}{2\pi i}\frac{\partial\omega}{\partial t} = \Big\{H_B + \sum_r\sum_{r'}[E_{rr'}+V_{rr'}(Y)]\,c_r^\dagger c_{r'} + \tfrac12\sum_r\sum_s\sum_{r'}\sum_{s'} F_{rs;r's'}\,c_r^\dagger c_s^\dagger c_{s'} c_{r'}\Big\}\,\omega, \tag{425}$$

where the operators $c$ stand for $a$ or for $b$ as the case may be.

It has been shown in Chap. VII, § 39, that it is possible to treat one
part, $B$ say, of a complex system $A+B$ as a complete system by
treating all the quantities referring to this part as matrices with
elements defined with respect to the different stationary states of $A$ taken
alone. This result has been proved by using for the function $\Omega$ describing
the whole system just the expansion (424) with the important restriction
that the functions $\chi$ should be exact solutions of the equation

$$H_A\chi = W_A\chi.$$

This treatment can be conveniently applied to the present problem
only when there is no interaction between the particles of $A$ and when
the individual functions $\psi_r(x)$ are exact solutions of the equation
$E\psi_r(x) = E'_r\psi_r(x)$. In this case the symmetrical or antisymmetrical
functions $\chi_n(X)$ will also be exact solutions of the equation

$$H_A\chi_n = W_{An}\chi_n,$$

where $H_A = \sum_{i=1}^{N} E_{x_i}$, and the theory of § 39 will be wholly applicable to
our problem.

This application is derived directly from the equation (425) if we put
$F = 0$ and $E_{rr'} = \delta_{rr'}E'_r$. Denoting further the sum $\sum_r E'_r\,c_r^\dagger c_r = \sum_r E'_r n_r$
by $W_{An}$ and putting


to' 


we get + 


h d 

2ttI dt 


(425 a) 
(425 


This equation coincides with the equation (329 a) of § 39 if the operator
of the interaction energy $M$ is defined as $\sum_r\sum_{r'} V_{rr'}(Y)\,c_r^\dagger c_{r'}$. As a
matter of fact, the result of its application to the function $\omega'_n$ can be
written in the form $\sum_{n'} M_{nn'}\,\omega'_{n'}$, where $n$ and $n'$ denote two sequences of
the partition numbers $n_1, n_2,\ldots, n_r,\ldots$ and $n'_1, n'_2,\ldots, n'_r,\ldots$ differing from
each other (as in the previously considered case) by one of the numbers
in the second sequence being greater and another less by 1 than the
corresponding numbers of the first sequence. In other words, the
matrix components of the interaction energy $M$ appearing in (425 b)
are taken with respect to collective states of the ‘ignored’ part $A$ of the
system $A+B$ which differ by just one particle jumping from one
individual state to another, or, in other words, by a one-quantum jump in
opposite directions of two of the quantized partition numbers $n_1, n_2,\ldots$
which specify the states of $A$.

The system $B$ can in its turn consist of a number of identical particles
of a different kind from those constituting the system $A$ (for instance,
$A$ may be a system of photons or protons and $B$ a system of
electrons). In this case it is possible to apply the method of intensity
quantization to the two systems simultaneously, by defining the
functions $\omega_n(Y,t)$ in (424) as symmetrical or antisymmetrical combinations
of certain orthogonal and normalized functions $\varphi_1(y), \varphi_2(y),\ldots, \varphi_r(y),\ldots$
describing a sequence of stationary states of the separate particles of
$B$. We can then take the equations (425 b) as our starting-point and
transform them by putting

$$\omega_n(Y,t) = \sum_m C_{mn}(t)\,\omega_m(Y),$$

where $\omega_m(Y)$ depends (symmetrically or antisymmetrically) on the
coordinates $Y$ only. We can also (and this is perhaps a more natural
procedure) carry out the two quantization processes simultaneously,
starting from the original equation $-\dfrac{h}{2\pi i}\dfrac{\partial\Omega}{\partial t} = H\Omega$ and putting

$$\Omega(X,Y,t) = \sum_m\sum_n C_{mn}(t)\,\omega_m(Y)\,\chi_n(X). \tag{426}$$

We thus obtain an equation of the following form:

$$-\frac{h}{2\pi i}\frac{dC}{dt} = (L+K+M)\,C, \tag{426 a}$$

where $L$ and $K$ are the quantized energy operators referring to the
two systems $A$ and $B$ taken separately, while $M$ is the operator of their
interaction energy. If $A$ is antisymmetrical and $B$ symmetrical, we can
use for $L$ and $K$ the expressions (402) and (417 b) respectively (affixing
the indices $x$ and $y$ to the operators $E$ and $F$ in order to distinguish
the particles of the two sorts), whereas the operator $M$ is expressed in
this case by the formula

$$M = \sum_r\sum_{r'}\sum_s\sum_{s'} V_{rr';ss'}\,a_r^\dagger a_{r'}\,b_s^\dagger b_{s'}, \tag{426 b}$$

where $v(x,y)$ is the interaction energy between one particle of the sort
$A$ and one particle of the sort $B$, and

$$V_{rr';ss'} = \iint \psi_r^*(x)\,\varphi_s^*(y)\,v(x,y)\,\psi_{r'}(x)\,\varphi_{s'}(y)\,dx\,dy.$$

In the equations (426 a) $C = C_{mn}(t)$ is to be considered as a wave
function whose arguments are the partition numbers $m_r$ and $n_s$, or
rather the corresponding operators, defined as $b_r^\dagger b_r$ and $a_s^\dagger a_s$ respectively.
These results can be generalized further for the case of three or more
systems of identical particles, for instance electrons, protons, and
photons, interacting with each other.

We are now going to consider more closely the particular case of the 
photons, i.e. light waves, in interaction with an ordinary material 
system, which for the sake of simplicity we shall suppose to consist 
of a single electron, forming with the fixed source of the external 
field in which it moves a hydrogen-like atom. The peculiarity of this 
problem lies in the fact that photons cannot actually be treated as 
ordinary particles. As has been emphasized in Part I (§24) the analogy 
between light and matter has a very limited scope, and the notion of 
photons must be considered as a useful fiction of the same sort as that 
of ‘phonons’ (sound-quanta). In applying this fiction to the interaction 
between light and matter we must remember in the first place the fact 
that the number of photons does not remain constant, photons being 
created in the act of emission and destroyed in the act of absorption. 
This fact excludes the possibility of describing a system of photons by 
the method of configuration space. Under such conditions a strict 
application of the intensity quantization scheme devised for ordinary 
particles to the case of photons is impossible. It is nevertheless possible 
to apply the final results to this rather fictitious case, thanks to the 
fact that we do not have to introduce any interaction between the 
photons. We must, however, suitably define the expression for the 
mutual energy between the photons on the one hand, and the material 
system (electron, atom) on the other, in terms of the partition numbers 
which describe the distribution of the photons over the different states, 
and, moreover, provide in a physically irrelevant way for a formal 
conservation of the total number of photons. 

This latter circumstance can easily be achieved by introducing an
additional state of zero energy corresponding by definition to an actual
absence of the photons. Emission or absorption of a photon will be
interpreted under this condition as the transition of a photon from or
into the zero state.

The total energy of the photons taken alone (if this part of the
system is referred to as $B$) can thus be represented by the operator

$$H_B = \sum_{r=0}^{\infty} E_{rr}\,n_r = \sum_{r=0}^{\infty} h\nu_r\,b_r^\dagger b_r, \tag{427}$$

where $\nu_0 = 0$. The operators $b$, $b^\dagger$ are introduced here not on the ground
that a system of photons is describable in the configuration space by
a symmetrical wave function, but because we know that the photons
conform to the statistics of Einstein and Bose, i.e. behave like material
particles of the ‘symmetrical’ type. It should further be remarked that
the quantities $E_{rr} = h\nu_r$ are introduced here not by the general formula
$E_{rr} = \int \psi_r^* E_y\psi_r\,dy$ (since neither the operator $E_y$ nor the wave functions
$\psi(y)$ have a meaning for photons) but by way of definition.

The part of the energy corresponding to the atom alone can be defined
in the usual way. It thus remains to define suitably the interaction
energy $M = \sum_i V(X, y_i)$, or rather the matrix elements $V_{rr'}(X)$, the
function $V(X, y_i)$ being itself just as meaningless as the operator $E_y$.

In looking for such a definition we can be guided by the classical 
expression for the energy of an atom or electron in the electromagnetic 
field of the light waves. This field, according to classical electro¬ 
dynamics, is fully determined by its vector potential A as a function 
of the coordinates and the time, while the scalar potential (f> can without 
any loss of generality be set equal to zero. The electric and magnetic 
intensity can be calculated with the help of A by means of the formulae 

E = —- —, H = curl A. 
c SI 


Now the energy of an electron in an additional field specified by the 
vector potential A is equal, if terms quadratic in A are neglected, to 


$$\frac{e}{c}\,\mathbf{A}\cdot\mathbf{v} = \frac{e}{cm}\,\mathbf{A}\cdot\mathbf{p}_x,$$

where $\mathbf{p}_x$ is the electron’s momentum. This formula can be taken over
into the wave mechanics if $\mathbf{p}_x$ is defined as the operator
$\frac{h}{2\pi i}\nabla$. In order to be able to treat this expression as the
mutual energy of the electron and of the photons, it remains to split up
A into separate parts, $\mathbf{A}_i$ say, which may be assumed to correspond
to the separate photons, and to find the matrix elements of $\mathbf{A}_i$
with respect to the different ‘states’ r and s of the photons. Putting,
for the sake of brevity,

$$\frac{e}{cm}(\mathbf{A}_i)_{rs} = \mathbf{P}_{rs},$$

where $\mathbf{P}_{rs}$ is obviously independent of the individuality of the photon
(specified by the index i), we thus get for the energy of the electron
with respect to the light waves the expression

$$M = \mathbf{p}\cdot\sum_r\sum_{r'}\mathbf{P}_{rr'}\,b_r^\dagger b_{r'}, \tag{427 a}$$

which can be interpreted as the mutual energy of the electron and the
photons. The problem is thus reduced to the determination of the
matrix elements.

The simplest way to determine them is based on the assumption that 
the perturbation energy (427 a) must be responsible for such acts as the 
emission or absorption of light only. This means that the non-vanishing 
elements $\mathbf{P}_{rr'}$ must correspond either to r = 0 or r' = 0. Since the
number of photons in the zero state can be assumed to be infinite
(i.e. actually indeterminate) the operators $b_0 = a_0\sqrt{n_0}$ and
$b_0^\dagger = \sqrt{n_0}\,a_0^\dagger$ must also have infinite characteristic
values, so that the matrix elements $\mathbf{P}_{0r}$ and $\mathbf{P}_{r0}$ must be
infinitely small. All we need, however, is their products with
$b_0^\dagger$ and $b_0$. Denoting these products by $\mathbf{v}_r^\dagger$ and
$\mathbf{v}_r$ respectively, we can reduce (427 a) to the form

$$M = \mathbf{p}\cdot\sum_r\bigl(\mathbf{v}_r^\dagger b_r+\mathbf{v}_r b_r^\dagger\bigr). \tag{427 b}$$

The operator $\mathbf{p}\cdot\mathbf{v}_r^\dagger b_r$ determines the probability of
emission and the operator $\mathbf{p}\cdot\mathbf{v}_r b_r^\dagger$ the probability of
absorption of a photon $h\nu_r$. Our problem would be completely solved
if we knew the dependence of $\mathbf{v}_r$ and $\mathbf{v}_r^\dagger$ on $h\nu_r$.
This dependence can be found by comparing the quantum interaction
operator (427 b) with the classical one

$$\frac{e}{cm}\,\mathbf{p}\cdot\sum_r\mathbf{A}_r,$$

where $\mathbf{A}_r$ is the harmonic component in the Fourier analysis of A with
the frequency $\nu_r$. The energy per unit volume corresponding to this
component is equal to $(E_r^0)^2/8\pi$, where $E_r^0$ is the amplitude of the
electric intensity (since in the case of light waves the amplitudes of
the electric and magnetic vectors are numerically equal). Now according
to the relation

$$\mathbf{E} = -\frac{1}{c}\frac{\partial\mathbf{A}}{\partial t}$$

we have

$$E_r^0 = \frac{2\pi\nu_r}{c}A_r^0.$$

The energy corresponding to a given harmonic vibration in the whole
volume V of the enclosure where it takes place is thus equal to
$\frac{\pi\nu_r^2}{2c^2}(A_r^0)^2V$. On the other hand, this energy must be equal
to the product of $h\nu_r$ with the number of photons associated with the
vibrations under consideration. We have therefore

$$\frac{\pi}{2c^2}\,\nu_r^2(A_r^0)^2V = h\nu_rn_r,$$

whence

$$\frac{e}{cm}A_r^0 = \frac{e}{m}\sqrt{\frac{2hn_r}{\pi\nu_rV}}. \tag{428}$$

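This expression, multiplied by the phase factor
$\cos(2\pi\nu_rt+\gamma_r) = \cos\phi_r$, must obviously correspond to the
quantum expression

$$\mathbf{v}_r^\dagger b_r+\mathbf{v}_rb_r^\dagger,$$

which can be written in a similar way if we assume that
$\mathbf{v}_r = \mathbf{v}_r^\dagger$ and if further the operators
$a_r = e^{i2\pi\theta_r/h}$ and $a_r^\dagger = e^{-i2\pi\theta_r/h}$ are identified
with the complex phase factors $e^{i\phi_r}$ and $e^{-i\phi_r}$. In the limiting
case of very high characteristic values of $n_r$ we can treat $\sqrt{n_r}$
and $a_r$ as commuting (neglecting 1 compared with $n_r$) and write
accordingly

$$\mathbf{v}_r^\dagger b_r+\mathbf{v}_rb_r^\dagger = \mathbf{v}_r(b_r+b_r^\dagger) = \mathbf{v}_r\sqrt{n_r}\,\bigl(e^{i2\pi\theta_r/h}+e^{-i2\pi\theta_r/h}\bigr) \approx 2\mathbf{v}_r\sqrt{n_r}\cos\phi_r. \tag{428 a}$$

Hence it follows that

$$\frac{e}{cm}\mathbf{A}_r^0 = 2\mathbf{v}_r\sqrt{n_r},$$

which is identical with (428) if we put

$$v_r = v_r^\dagger = \frac{e}{m}\sqrt{\frac{h}{2\pi\nu_rV}}. \tag{428 b}$$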


The direction of the vector $\mathbf{v}_r$ coincides with that of $\mathbf{A}_r$,
i.e. with the direction of the electrical vibrations. The wave equation
which determines the motion and interaction of the atom (electron) with
the photons can be written accordingly in the form

$$-\frac{h}{2\pi}\frac{\partial\omega}{\partial t} = i\Bigl[H_A+\sum_rh\nu_rn_r+\mathbf{p}\cdot\sum_r\mathbf{v}_r(b_r+b_r^\dagger)\Bigr]\omega, \tag{429}$$

which can be obtained from the general equation (425) if we put F = 0,
interchange x with y, and determine, as shown above, the interaction
energy matrix $V_{rr'}$. Substituting in (429)
$\omega = \omega'e^{-i2\pi W_nt/h}$, where $W_n = \sum_rh\nu_rn_r$, we can reduce
the preceding equation to the form

$$-\frac{h}{2\pi}\frac{\partial\omega'}{\partial t} = i\Bigl[H_A+\mathbf{p}\cdot\sum_r\mathbf{v}_r(b_r+b_r^\dagger)\Bigr]\omega', \tag{429 a}$$

which is a special case of (425 b).

Regarding $M = \mathbf{p}\cdot\sum_r\mathbf{v}_r(b_r+b_r^\dagger)$ as the operator of
the perturbation energy causing transitions between the stationary
states of the atom (electron) with emission or absorption of radiation,
we can determine the probability of such transitions by calculating the
corresponding matrix elements of M. Now these matrix elements can be
written in the form

$$M_{J,n;\,J',n'} = \sum_r(\mathbf{p}\cdot\mathbf{v}_r)_{J,J'}\,(b_r+b_r^\dagger)_{n,n'}.$$

By the definition of the operators b we have

$$b_r\omega_n = a_r\sqrt{n_r}\,\omega(n_r) = \sqrt{n_r+1}\,\omega(n_r+1),$$

whence it follows, in view of the orthogonality and normalization of
the functions $\omega$, that the matrix element $(b_r)_{nn'}$ is different from
zero only if $n'_r = n_r+1$, all the other numbers of the two sequences
$n_r$ and $n'_r$ (apart from $n_0$) being the same. The value of this matrix
element is equal in this case to $\sqrt{n_r+1}$. For the matrix element
$(b_r^\dagger)_{nn'}$ we find likewise a non-vanishing value, namely
$\sqrt{n_r}$, if $n'_r = n_r-1$, all the other numbers of the two sequences
being the same [cf. eq. (423 a), § 47].

We thus see that the probability of the emission of a quantum of 
frequency v r is proportional to 

$$|(\mathbf{p}\cdot\mathbf{v}_r)_{J,J'}|^2\,(n_r+1), \tag{429 b}$$

while that of its absorption is proportional to

$$|(\mathbf{p}\cdot\mathbf{v}_r)_{J,J'}|^2\,n_r, \tag{429 c}$$

the proportionality coefficient being, of course, the same in both cases. 
The energies of the two states of the atom J and J' must differ from 
each other by an amount approximately equal to ±hv r . The fact that 
the absorption probability is proportional to the number of photons in 
the initial state, i.e. to the energy of the latter, is quite natural. It is, 
however, very remarkable to find that the emission probability is pro¬ 
portional to the number of photons not in the initial, but in the final 
state, being thus different from zero even if n r = 0, i.e. if no photons 
of the given sort were present at the beginning (except in the zero 
state). This result gives an interpretation of the spontaneous emission 
of light as stimulated by a photon which was initially in the zero state. 
The sum $n_r+1$ in (429 b) can be interpreted accordingly as the expression
of the fact that the emission of light takes place in two ways,
namely, as a result of the stimulative action of the light already present,
the probability of this induced emission being exactly equal to the
probability of the absorption, and also spontaneously. The ratio $n_r : 1$
must therefore be equal to the ratio $B\rho/A$ of the probability of absorption
or induced emission to the probability of spontaneous emission, A and
B being the well-known Einstein coefficients (see Part I, §§ 17 and 18)
and $\rho$ the density of the energy per unit volume and per unit frequency
range.

This result can easily be verified. We have, in fact,

$$\rho\,d\nu = \frac{1}{V}\sum_{d\nu}n_rh\nu_r,$$

where the summation is extended over all the frequencies within the
given range. Now, as has been shown in Part I, §§ 11 and 37, the number
dz of free oscillations of any kind in an enclosure with a volume V,
whose wave number lies in the range dk, is equal to $4\pi Vk^2\,dk$. Applying
this to light oscillations (with a given state of polarization) we get,
since $k = \nu/c$,

$$dz = \frac{4\pi V}{c^3}\nu^2\,d\nu.$$

If $n_r$ is considered as a practically continuous function of the frequency,
it can be assumed to have the same value for all oscillations within the
small range $d\nu$. We then get

$$\rho\,d\nu = \frac{1}{V}n_rh\nu_r\,dz = \frac{4\pi}{c^3}n_rh\nu^3\,d\nu,$$

whence

$$\frac{\rho}{n_r} = \frac{4\pi h\nu^3}{c^3},$$

which actually coincides with the ratio A/B found in Part I, eq. (103 a).
We thus see that the theory of the emission and absorption of radiation 
developed in this section (and due to Dirac) has the advantage of inter¬ 
preting the spontaneous transitions with emission of radiation, actually 
combining such spontaneous emissions with the induced ones. 
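
As a numerical illustration of the ratio just obtained, the following
Python sketch evaluates the number of photons per oscillation,
$n_r = c^3\rho/4\pi h\nu^3$ (one state of polarization), and the resulting
ratio $(n_r+1) : n_r$ of the emission to the absorption probability. The
function names, the sample frequency and the sample densities are
hypothetical choices made for illustration only; Gaussian (CGS) units
are assumed.

```python
import math

H = 6.62607015e-27   # Planck's constant, erg s (CGS units assumed)
C = 2.99792458e10    # velocity of light, cm/s


def photons_per_oscillation(rho, nu):
    """n_r = c^3 rho / (4 pi h nu^3) for a single state of polarization."""
    return C**3 * rho / (4.0 * math.pi * H * nu**3)


def emission_to_absorption(n_r):
    """Ratio (n_r + 1) : n_r of the emission to the absorption probability."""
    return (n_r + 1.0) / n_r


nu = 5.0e14                                   # sample frequency, 1/s (assumed)
rho_unit = 4.0 * math.pi * H * nu**3 / C**3   # density for which n_r = 1
for factor in (0.01, 1.0, 100.0):
    n_r = photons_per_oscillation(factor * rho_unit, nu)
    ratio = emission_to_absorption(n_r)
    print(f"n_r = {n_r:8.2f}   emission/absorption = {ratio:8.3f}")
```

For weak radiation ($n_r \ll 1$) the spontaneous part dominates the
emission, while for intense radiation the ratio tends to unity.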

It is easy to obtain from the above theory the absolute values of the
emission and absorption probabilities. To do this we must multiply
the expressions (429 b) and (429 c) by $\pi^2/h^2$ and further by
$4\pi V\nu^2/c^3$, so long as we are interested in the emission or absorption
not of a particular photon with the frequency $\nu_r$ and a given direction
of motion, but of any photon with a frequency lying within a narrow
range $\Delta\nu$ irrespective of the direction of motion. In view of the
unsharp character of the resonance, summation of all the transition
probabilities within the range $\Delta\nu$ leads to a result which is
independent of the actual magnitude of $\Delta\nu$.

The resulting probability of a ‘spontaneous’ emission, for example,
per unit time and unit frequency range thus turns out to be

$$|(\mathbf{p}\cdot\mathbf{v}_r)_{J,J'}|^2\,\frac{\pi^2}{h^2}\,\frac{4\pi V\nu_r^2}{c^3}.$$

Substituting here the expression (428 b) for $v_r$ and denoting the
component of p in the direction of the vector $\mathbf{v}_r$ (i.e. the direction
of the electrical oscillations) by $p_r$, we get

$$A = |(p_r)_{J,J'}|^2\,\frac{2\pi^2e^2}{hm^2c^3}\,\nu_r,$$

or, if $p_r$ is replaced by $m\frac{dx_r}{dt} = m\,2\pi\nu_r\,i\,x_r$,

$$A = \frac{8\pi^4e^2\nu_r^3}{hc^3}\,|(x_r)_{J,J'}|^2,$$

which coincides with the formula (93) of Part I if we take account
of a definitely polarized radiation only.

In order to account not only for the emission and absorption but 
also for the scattering of radiation, we must consider the hitherto 
neglected term of the perturbation energy, which is proportional to 
the square of the vector potential A. 


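Subtracting from the operator
$\frac{1}{2m}\bigl(\mathbf{p}+\frac{e}{c}\mathbf{A}\bigr)^2$ the operator
$\frac{1}{2m}\mathbf{p}^2$ which corresponds to A = 0, we find for M, the
operator of the mutual energy between the electron and the light, the
expression

$$M = \frac{e}{cm}\mathbf{A}\cdot\mathbf{p}+\frac{e^2}{2mc^2}A^2, \tag{430}$$

differing from the previous one by the extra term $e^2A^2/2mc^2$.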

In order to find its quantum interpretation let us put
$\mathbf{A} = \sum_r\mathbf{A}_r$, where $\mathbf{A}_r = \mathbf{A}_r^0\cos\phi_r$ denotes a
harmonic component of A. This gives

$$A^2 = \sum_r\sum_s\mathbf{A}_r\cdot\mathbf{A}_s = \sum_r\sum_s\mathbf{A}_r^0\cdot\mathbf{A}_s^0\cos\phi_r\cos\phi_s,$$

and consequently, according to (428 a),

$$M_s = \frac{e^2}{2mc^2}A^2 = 2m\sum_r\sum_s\mathbf{v}_r\cdot\mathbf{v}_s\sqrt{n_r}\sqrt{n_s}\cos\phi_r\cos\phi_s
= \tfrac12m\sum_r\sum_s\mathbf{v}_r\cdot\mathbf{v}_s\sqrt{n_r}\sqrt{n_s}\,(e^{i\phi_r}+e^{-i\phi_r})(e^{i\phi_s}+e^{-i\phi_s}),$$

which in view of the correspondence between the complex phase factors
$e^{i\phi}$, $e^{-i\phi}$ and the operators $a = e^{i2\pi\theta/h}$,
$a^\dagger = e^{-i2\pi\theta/h}$ can be considered as the approximate form of
the operator

$$M_s = \tfrac12m\sum_r\sum_s\mathbf{v}_r\cdot\mathbf{v}_s\,(b_r^\dagger b_s+b_s^\dagger b_r) = m\sum_r\sum_s\mathbf{v}_r\cdot\mathbf{v}_s\,b_r^\dagger b_s,$$

if we leave aside extra terms of the type
$\tfrac12m\sum_r\sum_s\mathbf{v}_r\cdot\mathbf{v}_s\,(b_rb_s+b_r^\dagger b_s^\dagger)$ which
correspond to a double emission or a double absorption and which do
not seem to have any real physical significance. Substituting here the
expression (428 b) for v and denoting by $\theta_{rs}$ the angle between the
directions of the electrical vibrations of the types r and s, we get

$$M_s = \frac{e^2h}{2\pi mV}\sum_r\sum_s\frac{\cos\theta_{rs}}{\sqrt{\nu_r\nu_s}}\,b_r^\dagger b_s. \tag{430 a}$$


This operator, considered as a perturbation energy, determines the 
probability of those transitions, in which one photon ($h\nu_r$) is absorbed
and another ($h\nu_s$) is emitted. Since the state of the atom must not
change [this follows from the fact that its coordinates do not explicitly
appear in (430 a)], i.e. its energy must remain the same, the two
frequencies $\nu_r$ and $\nu_s$ of the absorbed and emitted light must likewise be
the same; we thus have to do with a change of its direction only. This
is the normal coherent scattering. As has been pointed out in § 23, the 
scattered light can in reality be different from the incident one (as in 
the Raman or Compton effect). The above theory cannot be extended 
to such cases of combined scattering.†

† As a matter of fact, it is not strictly applicable even to simple
scattering: if instead of the Schrödinger equation containing terms
quadratic in the potential A, we used Dirac’s equation which is linear
in A, we should obtain to the first approximation (corresponding to
simple transitions) no scattering at all.


49. Electromagnetic Waves with Quantized Amplitudes; Theory
of Spontaneous Transitions and of Radiation Damping

The preceding theory (due to Dirac) can be greatly simplified if,
following Jordan, Pauli, and especially Heisenberg, we do not explicitly
introduce the notion of photons but treat the phenomena of light from
the point of view of the wave theory, replacing, however, the classical
electromagnetic waves by waves (oscillations) with quantized amplitudes.
Let $\phi(x, y, z, t)$ denote a plane harmonic wave of some quantity $\phi$
characteristic of the electromagnetic field: electric or magnetic
field-strength, scalar or vector potential, etc. It may be a wave
travelling in a definite direction or a standing wave formed by the
superposition of two waves of the same frequency and amplitude,
travelling in opposite directions. In the former case we can put

$$\phi(x, y, z, t) = C_{\mathbf{k}}\,e^{i2\pi(\mathbf{k}\cdot\mathbf{r}-\nu t)}+C_{\mathbf{k}}^*\,e^{-i2\pi(\mathbf{k}\cdot\mathbf{r}-\nu t)}, \tag{431}$$

where r is the vector with the components x, y, z and k the wave vector;
the magnitude of the latter is connected with the frequency by the
relation $k = \nu/c$, c being the velocity of light. The two amplitudes
$C_{\mathbf{k}}$ and $C_{\mathbf{k}}^*$ must be conjugate complex quantities so that
$\phi$ may be real. The expression (431) can be rewritten accordingly in
the form

$$\phi(x, y, z, t) = A_{\mathbf{k}}\cos2\pi(\mathbf{k}\cdot\mathbf{r}-\nu t)+B_{\mathbf{k}}\sin2\pi(\mathbf{k}\cdot\mathbf{r}-\nu t), \tag{431 a}$$

where $A_{\mathbf{k}}$ and $B_{\mathbf{k}}$ are two real coefficients. Taking the sum of the
expressions (431) or (431 a) for various magnitudes and directions of
the vector k (forming a discrete or a continuous sequence) with suitably
chosen complex amplitude coefficients $C_{\mathbf{k}}$ (or $A_{\mathbf{k}}$, $B_{\mathbf{k}}$), we can
represent the value of the quantity $\phi$ as a function of the space
coordinates and of the time for any electromagnetic field in ‘empty
space’, i.e. satisfying d’Alembert’s equation

$$\nabla^2\phi-\frac{1}{c^2}\frac{\partial^2\phi}{\partial t^2} = 0. \tag{432}$$

It should be kept in mind, however, that this representation does not 
hold for an electromagnetic field produced by electric charges situated 
within the region under consideration, since such a field is determined 
by a non-homogeneous equation of the form 

$$\nabla^2\phi-\frac{1}{c^2}\frac{\partial^2\phi}{\partial t^2} = -4\pi\rho, \tag{432 a}$$

$\rho$ being the volume density of the charges if $\phi$ is the scalar potential,
or the electric current density if $\phi$ is the vector potential.

So long, however, as we are dealing with radiation, we may safely
assume equation (432) to hold, and accordingly represent its general
solution in the form of a sum (or integral) of the expressions (431).
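
The equivalence of the complex form (431) and the real form (431 a) can
be checked directly: expanding the exponentials gives $A = 2\,\mathrm{Re}\,C$
and $B = -2\,\mathrm{Im}\,C$. The following Python sketch, in which the
function names and the sample amplitude are hypothetical choices made
for illustration, compares the two expressions for a number of values
of the phase $2\pi(\mathbf{k}\cdot\mathbf{r}-\nu t)$.

```python
import cmath
import math


def real_amplitudes(c_k):
    """Return (A_k, B_k) of (431 a) for the complex amplitude C_k of (431)."""
    return 2.0 * c_k.real, -2.0 * c_k.imag


def phi_complex(c_k, phase):
    """C_k e^{i phase} + C_k^* e^{-i phase}, with phase = 2 pi (k.r - nu t)."""
    value = c_k * cmath.exp(1j * phase) + c_k.conjugate() * cmath.exp(-1j * phase)
    return value.real  # the imaginary parts of the two terms cancel


def phi_real(a_k, b_k, phase):
    """A_k cos(phase) + B_k sin(phase)."""
    return a_k * math.cos(phase) + b_k * math.sin(phase)


c_k = 0.3 - 0.7j                        # arbitrary sample amplitude (assumed)
a_k, b_k = real_amplitudes(c_k)
max_diff = max(abs(phi_complex(c_k, 0.1 * n) - phi_real(a_k, b_k, 0.1 * n))
               for n in range(63))      # phases from 0 to about 2 pi
print(a_k, b_k, max_diff)
```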

The transition from the classical electromagnetic theory of light to the
quantum theory can be achieved in the simplest way (without introducing
the notion of light quanta) by regarding the amplitude coefficients
$C_{\mathbf{k}}$, $C_{\mathbf{k}}^*$ not as ordinary complex numbers but as non-commuting
quantum operators proportional to the operators $b$, $b^\dagger$ which have been
used before, with conjugate complex proportionality coefficients
$\gamma_{\mathbf{k}}$, $\gamma_{\mathbf{k}}^*$ which are determined by the normalization
condition for the function $\phi_{\mathbf{k}}$. Adding to k the further suffix $\xi$
to indicate the polarization ($\xi = 1, 2$), we obtain the following
quantum expression for a plane polarized harmonic wave of light,

$$\phi_{\mathbf{k}\xi} = \gamma_{\mathbf{k}\xi}\,b_{\mathbf{k}\xi}\,e^{i2\pi(\mathbf{k}\cdot\mathbf{r}-\nu t)}+\gamma^*_{\mathbf{k}\xi}\,b^\dagger_{\mathbf{k}\xi}\,e^{-i2\pi(\mathbf{k}\cdot\mathbf{r}-\nu t)}. \tag{433}$$

The substitution of the operators $b$, $b^\dagger$ for the coefficients $C$, $C^*$
secures the ‘quantization’ of all those quantities which are expressed
as volume integrals of the square of $\phi$ (extended over the whole region
in which $\phi$ is different from zero).

Thus, for instance, taking the square of (433) and integrating over
a volume V outside which $\phi$ can be assumed to vanish, we get

$$\int_V\phi_{\mathbf{k}\xi}^2\,dV = |\gamma_{\mathbf{k}\xi}|^2\,V\,\bigl(b_{\mathbf{k}\xi}b^\dagger_{\mathbf{k}\xi}+b^\dagger_{\mathbf{k}\xi}b_{\mathbf{k}\xi}\bigr), \tag{433 a}$$

the squares of the two terms of (433) giving no contribution to the
integral on account of the periodic factors $e^{\pm i4\pi\mathbf{k}\cdot\mathbf{r}}$.

Now by the definition of the operators $b$, $b^\dagger$ we have

$$b^\dagger b = N, \qquad bb^\dagger = N+1,$$

where N is an integer or, more exactly, an operator capable of assuming
integral positive values only. Affixing to it the suffixes k, $\xi$ which
specify the oscillations under consideration, we thus get

$$\int_V\phi_{\mathbf{k}\xi}^2\,dV = 2|\gamma_{\mathbf{k}\xi}|^2\,V\,\bigl(N_{\mathbf{k}\xi}+\tfrac12\bigr). \tag{433 b}$$

If $\phi_{\mathbf{k}\xi}$ is identified with the electric intensity E, the expression
(433 b) divided by $4\pi$ can be interpreted as the electromagnetic energy
$W_{\mathbf{k}\xi}$ enclosed in the volume V (since the magnetic part of the energy
is equal to the electric one). Putting

$$\gamma_{\mathbf{k}\xi} = \sqrt{\frac{2\pi h\nu}{V}} = \gamma_\nu \qquad (\nu = ck), \tag{433 c}$$

we obtain for this energy the expression

$$W_{\mathbf{k}\xi} = \bigl(N_{\mathbf{k}\xi}+\tfrac12\bigr)h\nu,$$

which differs from that of the photon theory by the presence of the
term $\tfrac12$ in the brackets (N being the number of photons).

In order to get rid of this term one usually replaces the sum
$b^\dagger b+bb^\dagger$ in (433 a) by $2b^\dagger b$, thus putting

$$\int_V\phi_{\mathbf{k}\xi}^2\,dV = 2|\gamma_{\mathbf{k}\xi}|^2\,V\,b^\dagger_{\mathbf{k}\xi}b_{\mathbf{k}\xi} = 2|\gamma_{\mathbf{k}\xi}|^2\,V\,N_{\mathbf{k}\xi},$$

which is a wholly unwarranted procedure. It can be shown, however,
that in the accurate expression of the electromagnetic energy
which involves the sum of four terms (corresponding to the scalar
potential and to the three components of the vector potential) or of
six terms (corresponding to the three components of the electric
intensity and the three components of the magnetic intensity) the
$\tfrac12$ cancels out so that the energy reduces to an integral multiple
of $h\nu$.

In the general case of an electromagnetic field represented by a sum
of terms of the form (433) satisfying given boundary conditions
(corresponding, for example, to radiation enclosed in a vessel with
perfectly reflecting walls), the integral $\int\phi^2\,dV$, on account of the
mutual orthogonality of the different normal oscillations
$\phi_{\mathbf{k}\xi}$, reduces to the sum of the expressions (433 b) for all the
values of k, $\xi$ concerned.
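
The relations $b^\dagger b = N$, $bb^\dagger = N+1$ and the resulting energy
values $(N+\tfrac12)h\nu$ can be illustrated with finite matrices. The
following Python sketch is made under assumptions of its own: it uses
the usual matrix representation in which b lowers N by unity, a basis
truncated at N = 5, and helper names chosen here for illustration. The
truncation spoils the last diagonal element of $bb^\dagger$, which is
therefore left out of the energy list.

```python
import math


def annihilation(dim):
    """Matrix of b in the basis N = 0, 1, ..., dim-1: <n-1| b |n> = sqrt(n)."""
    return [[math.sqrt(col) if col == row + 1 else 0.0 for col in range(dim)]
            for row in range(dim)]


def transpose(m):
    return [list(r) for r in zip(*m)]


def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]


DIM = 6
b = annihilation(DIM)
b_dag = transpose(b)          # b is real, so its adjoint is its transpose
number = matmul(b_dag, b)     # b† b
shifted = matmul(b, b_dag)    # b b†

diag_n = [round(number[i][i], 12) for i in range(DIM)]
diag_s = [round(shifted[i][i], 12) for i in range(DIM)]
# Energy in units of h*nu from the symmetrized product (b b† + b† b)/2;
# the last basis state is omitted because of the truncation.
energy = [(diag_n[i] + diag_s[i]) / 2 for i in range(DIM - 1)]
print(diag_n)
print(diag_s)
print(energy)
```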

We shall now apply the method of quantized electromagnetic waves 
to the interaction between light and matter. The light will be con¬ 
sidered as a perturbation and the matter described in the usual way 
by a superposition of the stationary states that would persist in the 
absence of the perturbation, i.e.

$$\psi = \sum_ra_r\psi_r = \sum_ra_r\psi_r^0(x)\,e^{-i2\pi\nu_rt}.$$

The amplitude coefficients a r will be treated to begin with as ordinary 
numbers; for the sake of simplicity the material system will be imagined 
to consist of a single electron bound to a fixed centre of force (hydrogen- 
like atom). 

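The perturbation due to the light will result in the variation of the
coefficients a with the time; this is determined by the well-known
equations

$$-\frac{h}{2\pi i}\frac{da_r}{dt} = \sum_sS_{rs}a_s. \tag{434}$$

The perturbation energy can be written in the form

$$S = T\phi = \sum_{\mathbf{k},\xi}T\phi_{\mathbf{k}\xi}, \tag{434 a}$$

where T is some quantity characteristic of the atom, for instance, its
electric moment if $\phi$ represents the electric force.

Substituting in (434 a) the expression (433) for $\phi_{\mathbf{k}\xi}$, we get

$$S = \sum_\alpha\gamma_\alpha\bigl(T_\alpha^+b_\alpha\,e^{-i2\pi\nu_\alpha t}+T_\alpha^-b_\alpha^\dagger\,e^{i2\pi\nu_\alpha t}\bigr), \tag{434 b}$$

where the index $\alpha$ is an abbreviation for k, $\xi$;
$T_\alpha^\pm = \frac{e}{cm}\,p_n\,e^{\pm i2\pi\mathbf{k}\cdot\mathbf{r}}$ if $\phi$ denotes the vector
potential, and $\nu_\alpha = ck$. Hence we get

$$S_{rs} = \sum_\alpha\gamma_\alpha\bigl\{(T_\alpha^+)_{rs}\,b_\alpha\,e^{i2\pi(\nu_{rs}-\nu_\alpha)t}+(T_\alpha^-)_{rs}\,b_\alpha^\dagger\,e^{i2\pi(\nu_{rs}+\nu_\alpha)t}\bigr\},$$

where $\nu_{rs} = \nu_r-\nu_s$.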

So far the present theory is formally identical with the previously
considered theory of the perturbation produced by classical (i.e.
non-quantized) electromagnetic waves. We can therefore use for the
amplitudes $a_r$ the same approximate expressions as have been derived
before [(175), § 22]. It must be remembered, however, that the
corresponding probabilities $|a_r|^2$, just as the probability amplitudes
$a_r$ ($r \neq s$), are to be dealt with not as ordinary numbers but as
operators. In order to obtain results comparable with the experimental
data we
must consider the characteristic values of these operators, or their 
probable values for a number of states corresponding to different charac¬ 
teristic values. We need not discuss here the method of calculating 
these probable values since in the applications they are usually known 
a priori. The important thing to be noticed is that the use of quantized 
electromagnetic waves involves the introduction of ‘second-order pro¬ 
babilities’, i.e. of the probability that the ordinary (‘first-order’) 
probability of some state (r) should have a given value, out of a number 
of possible characteristic values. Instead of directly giving the value of 
the transition probabilities, the operators |a r | 2 considered as functions 
of the time (with the condition that at t = 0 one of them only has 
a characteristic value different from zero), will serve to determine the 
probable (or average) values of these transition probabilities. 

Another important point is the fact that in calculating the probability 
operators |a r | 2 we must take into account the non-commutative character 
of the operators $b_\alpha$, $b_\alpha^\dagger$ whose squares or products occur in the
expression of the product of $a_r$ with $a_r^*$. It thus becomes necessary to
define in an unambiguous way the order in which the operators $a_r$ and
$a_r^*$ must be multiplied by each other. This order being adequately fixed,
the commutation relations which are satisfied by the operators
$b_\alpha$, $b_\alpha^\dagger$ enable
one to incorporate in the perturbation theory of the radiative transitions 
those transitions which are classically distinguished as spontaneous on 
exactly the same footing as the ordinary ‘induced’ ones. 

We shall consider, just as in § 22 (or § 18, Part I), a radiation with a 
practically continuous spectrum (such as the thermal radiation in 
statistical equilibrium at a given temperature). Assuming the material 
system (atom) to be initially in a given state s, we get to a first
approximation ($r \neq s$)

$$a_r = -\frac{1}{h}\sum_\alpha\gamma_\alpha\Bigl\{(T_\alpha^+)_{rs}\,b_\alpha\,\frac{e^{i2\pi(\nu_{rs}-\nu_\alpha)t}-1}{\nu_{rs}-\nu_\alpha}+(T_\alpha^-)_{rs}\,b_\alpha^\dagger\,\frac{e^{i2\pi(\nu_{rs}+\nu_\alpha)t}-1}{\nu_{rs}+\nu_\alpha}\Bigr\}a_s^0, \tag{435}$$

where $a_s^0 = a_s^{0*} = 1$.

Let us consider in the first place a transition s → r to a state of
higher energy $W_r > W_s$ under the condition of unsharp resonance
with the electromagnetic waves in a small frequency range near
$\nu_\alpha = \nu_{rs} = (W_r-W_s)/h$. We can then drop the second term in (435)
compared with the first one. It now remains to multiply $a_r$ by its
conjugate complex, dropping all terms containing the $b_\alpha$’s with
different values of $\alpha$, and to sum over the frequency range considered.

Before we do this we must, however, make the following important
remark about the order of the factors in the product of $a_r$ with $a_r^*$.
According to (435) $a_r$ ($r \neq s$) must be considered not as an ordinary
number but as an operator of the same type as b; its conjugate complex
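must be replaced accordingly by the adjoint operator

$$a_r^\dagger = -\frac{1}{h}\sum_\alpha\gamma_\alpha\,(T_\alpha^+)^*_{rs}\,b_\alpha^\dagger\,\frac{e^{-i2\pi(\nu_{rs}-\nu_\alpha)t}-1}{\nu_{rs}-\nu_\alpha}\,a_s^{0*}$$

[which corresponds to the first term of (435)].

Correct results are then obtained if the operator which determines the
probability of the state is defined by the product $a_r^\dagger a_r$ and not by
$a_ra_r^\dagger$.

In carrying out the summation over the different oscillations we can
drop all those products $b_\alpha^\dagger b_\beta$ for which $\alpha \neq \beta$ (in view of
the supposed incoherent character of the radiation). This gives

$$a_r^\dagger a_r = a_s^{0*}a_s^0\,\frac{\gamma_\nu^2}{h^2}\,\overline{|(T^+)_{rs}|^2}\;\frac{1}{\Delta\nu}\sum^{(\Delta\nu)}b_\alpha^\dagger b_\alpha\int_{-\Delta\nu}^{+\Delta\nu}\Bigl|\frac{e^{i2\pi(\nu-\nu_{rs})t}-1}{\nu-\nu_{rs}}\Bigr|^2\,d(\nu-\nu_{rs}),$$

where $\nu_0 = \nu_{rs}$ is the resonance frequency, $\overline{|(T^+)_{rs}|^2}$ the
average value of $|(T^+)_{rs}|^2$ for all the directions of the vector k with
the fixed magnitude $\nu_0/c$, and $\Delta\nu$ a small frequency range containing
the resonance frequency and yet large enough to make the integrand very
small compared with 1 for $\nu-\nu_{rs} = \pm\Delta\nu$. The integral being equal
to $4\pi^2t$, we thus get

$$a_r^\dagger a_r = a_s^{0*}a_s^0\,\frac{4\pi^2\gamma_\nu^2}{h^2}\,\overline{|(T^+)_{rs}|^2}\;\frac{t}{\Delta\nu}\sum^{(\Delta\nu)}b_\alpha^\dagger b_\alpha.$$

Let $Z_\nu\Delta\nu$ be the number of different oscillations in the frequency
range under consideration, i.e. the number of summands in
$\sum^{(\Delta\nu)}b_\alpha^\dagger b_\alpha$. For isotropic thermal radiation
$Z_\nu\Delta\nu = \frac{8\pi V}{c^3}\nu^2\Delta\nu$ [cf. Part I, § 29, (141)]; we can
then put

$$\frac{1}{\Delta\nu}\sum^{(\Delta\nu)}b_\alpha^\dagger b_\alpha = Z_\nu\,\overline{b_\alpha^\dagger b_\alpha},$$

and consequently

$$a_r^\dagger a_r = a_s^{0*}a_s^0\,B^+_{rs}\,\frac{Z_\nu}{V}\,h\nu\,\overline{b_\alpha^\dagger b_\alpha}\;t, \tag{435 a}$$

where

$$B^+_{rs} = \frac{4\pi^2V\gamma_\nu^2}{h^3\nu}\,\overline{|(T^+)_{rs}|^2} \qquad (\nu = \nu_{rs}). \tag{435 b}$$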
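
The value $4\pi^2t$ quoted above for the resonance integral can be checked
numerically, since $|e^{i2\pi xt}-1|^2 = 4\sin^2\pi xt$. The following Python
sketch evaluates the integral by the midpoint rule over a large but
finite range of $x = \nu-\nu_{rs}$; the function name, the cut-off
`half_width` and the number of steps are choices made here for
illustration, and the finite cut-off makes the numerical value fall
slightly short of the limiting one.

```python
import math


def resonance_integral(t, half_width=500.0, steps=200_000):
    """Midpoint-rule estimate of the integral of |(e^{i 2 pi x t} - 1)/x|^2
    over -half_width < x < half_width, where x = nu - nu_rs."""
    dx = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps):
        x = -half_width + (i + 0.5) * dx   # the midpoints never hit x = 0
        # |e^{i 2 pi x t} - 1|^2 = 4 sin^2(pi x t)
        total += 4.0 * math.sin(math.pi * x * t) ** 2 / (x * x)
    return total * dx


results = {t: resonance_integral(t) for t in (1.0, 2.0)}
for t, value in results.items():
    print(t, value, 4.0 * math.pi**2 * t)
```

The proportionality of the result to t is what makes it possible to
speak of a transition probability per unit time.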


Let us now consider the opposite transition r → s due to the (unsharp)
resonance with electromagnetic oscillations of the same frequency as in
the preceding case. Reversing the indices s and r, we obtain for the
probability amplitude of the transition r → s the expression

$$a_s = -\frac{1}{h}\sum_\alpha\gamma_\alpha\Bigl\{(T_\alpha^+)_{sr}\,b_\alpha\,\frac{e^{i2\pi(\nu_{sr}-\nu_\alpha)t}-1}{\nu_{sr}-\nu_\alpha}+(T_\alpha^-)_{sr}\,b_\alpha^\dagger\,\frac{e^{i2\pi(\nu_{sr}+\nu_\alpha)t}-1}{\nu_{sr}+\nu_\alpha}\Bigr\}a_r^0. \tag{436}$$

Since $\nu_{sr} = -\nu_{rs}$, and consequently $\nu_{sr}+\nu_\alpha \approx 0$, we can now
drop the first term of this expression and not the second one, which
gives

$$a_s^\dagger a_s = a_r^{0*}a_r^0\,B^-_{rs}\,\frac{Z_\nu}{V}\,h\nu\,\overline{b_\alpha b_\alpha^\dagger}\;t. \tag{436 a}$$

This differs from (435 a) by the inverse order of the factors $b_\alpha$ and
$b_\alpha^\dagger$ and also in a minor way through the substitution of $B^-_{rs}$
for $B^+_{rs}$.

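Now we have

$$b_\alpha^\dagger b_\alpha = N_\alpha, \qquad b_\alpha b_\alpha^\dagger = N_\alpha+1,$$

where $N_\alpha$ is the operator representing the (integral) number of light
quanta associated with the oscillations of the type $\alpha$. Passing from
operators to probable values, we get

$$\frac{Z_\nu}{V}\,h\nu\,\overline{b_\alpha^\dagger b_\alpha} = \frac{Z_\nu}{V}\,h\nu\,\overline{N_\alpha} = \rho_\nu,$$

where $\rho_\nu$ is the spectral density of radiation per unit volume, and

$$\frac{Z_\nu}{V}\,h\nu\,\overline{b_\alpha b_\alpha^\dagger} = \frac{Z_\nu}{V}\,(\overline{N_\alpha}+1)\,h\nu = \rho_\nu+\frac{8\pi h\nu^3}{c^3}.$$

Hence the probable values of the probabilities for the transitions s → r
and r → s referred to unit time are

$$P_{s\to r} = B^+_{rs}\,\rho_\nu \qquad \text{if } W_r > W_s,$$

and

$$P_{r\to s} = B^-_{rs}\Bigl(\rho_\nu+\frac{8\pi h\nu^3}{c^3}\Bigr) = A_{rs}+B^-_{rs}\,\rho_\nu. \tag{436 b}$$

We thus see that on the present theory ‘spontaneous’ transitions from
a state of higher energy to that of lower energy become completely
fused with the induced transitions of the same type. The relation

$$A_{rs} = \frac{Z_\nu}{V}\,h\nu\,B^-_{rs} = \frac{8\pi\nu^2}{c^3}\,h\nu\,B^-_{rs}$$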


between the probability coefficients A and B referring to spontaneous 
and induced transitions is just that which has been obtained in Part I, 
§§ 17 and 18, by the method of ‘classical’ electromagnetic waves. The
only difference consists in the multiplication of the quantity T
characterizing the atom by the factors $e^{\pm i2\pi\mathbf{k}\cdot\mathbf{r}}$ characterizing
the radiation, which corresponds to the introduction of two somewhat
different coefficients $B^+_{rs}$ (for absorption) and $B^-_{rs}$ (for emission)
instead of the single one considered before. It should be remarked,
however, that for an isotropic radiation characterized by all the
directions of the vector k being equally probable, the two coefficients
are identical. If, moreover, the wave-length $\lambda = 1/k$ is large compared
with the effective linear extension of the atom the factors
$e^{\pm i2\pi\mathbf{k}\cdot\mathbf{r}}$ can be dropped altogether. The expression (435 b)
reduces in this case to that obtained before (Part I, § 17), if T is
defined as the electric moment of the atom in the direction of the
electric intensity $\phi$. Substituting the corresponding expression (433 c)
for $\gamma_\alpha$ in (435 b), we get, since

$$\overline{|T_{rs}|^2} = \frac{e^2}{3}\bigl(|x_{rs}|^2+|y_{rs}|^2+|z_{rs}|^2\bigr),$$

$$B_{rs} = \frac{8\pi^3}{3h^2}\,e^2\bigl(|x_{rs}|^2+|y_{rs}|^2+|z_{rs}|^2\bigr),$$

in agreement with (103), § 18 of Part I.

As a second illustration of the method of quantized electromagnetic
waves we shall apply it, following Rosenfeld, Weisskopf and Wigner,
to the problem of the radiation damping.

Let us return to the perturbation equations (434) and let us assume
for the sake of simplicity that $S_{rs}$ is different from zero for two states
only, r = 1 and s = 0 say (the diagonal elements $S_{11}$ and $S_{00}$ likewise
vanishing).

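The equations (434) reduce under these conditions to the following
two:

$$-\frac{h}{2\pi i}\frac{da_1}{dt} = f\,a_0, \qquad -\frac{h}{2\pi i}\frac{da_0}{dt} = g\,a_1, \tag{437}$$

where

$$f = \sum_\alpha\gamma_\alpha\bigl[(T_\alpha^+)_{10}\,b_\alpha\,e^{i2\pi(\nu_{10}-\nu_\alpha)t}+(T_\alpha^-)_{10}\,b_\alpha^\dagger\,e^{i2\pi(\nu_{10}+\nu_\alpha)t}\bigr],$$

$$g = \sum_\alpha\gamma_\alpha\bigl[(T_\alpha^+)_{01}\,b_\alpha\,e^{-i2\pi(\nu_{10}+\nu_\alpha)t}+(T_\alpha^-)_{01}\,b_\alpha^\dagger\,e^{-i2\pi(\nu_{10}-\nu_\alpha)t}\bigr]. \tag{437 a}$$

We shall assume that at the initial moment (t = 0) $a_1 = 1$ and $a_0 = 0$,
which means that the atom was initially in the excited state, and shall
try to solve our problem more exactly than was done before (when
$a_1$ was considered as constant) by putting

$$a_1 = e^{-2\pi\Gamma t}. \tag{438}$$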

This corresponds to a radioactive-like decay of the number of atoms
in the excited state (1) owing to their spontaneous transition to the
normal one (0). Substituting this expression in the second equation
(437) and integrating, we get

$$a_0 = \frac{1}{h}\sum_\alpha\gamma_\alpha\Bigl\{(T_\alpha^+)_{01}\,b_\alpha\,\frac{e^{-i2\pi(\nu_\alpha+\nu_{10}-i\Gamma)t}-1}{\nu_{10}+\nu_\alpha-i\Gamma}+(T_\alpha^-)_{01}\,b_\alpha^\dagger\,\frac{e^{i2\pi(\nu_\alpha-\nu_{10}+i\Gamma)t}-1}{\nu_{10}-\nu_\alpha-i\Gamma}\Bigr\}. \tag{438 a}$$

The first term in this sum can be dropped so long as $\nu_\alpha$ lies in the
vicinity of $\nu_{10}$, just as in the derivation of (436 a).

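In order to find the decay or damping constant $\Gamma$ we must substitute
this expression in the first equation (437) with due account of the order
of the factors $b$, $b^\dagger$ and sum over all the $\alpha$’s in the resonance
range $\delta\nu$. In doing this we can drop the second term in f, for in view
of the incoherent character of the oscillations the probable (average)
value of $b_\beta^\dagger b_\alpha^\dagger$ vanishes both when $\alpha \neq \beta$ and
$\alpha = \beta$, the only non-vanishing terms being those containing the
products $b_\alpha b_\alpha^\dagger = N_\alpha+1$. We thus find

$$\frac{h}{i}\,\Gamma e^{-2\pi\Gamma t} = \frac{1}{h}\sum_\alpha\gamma_\alpha^2\,(T_\alpha^+)_{10}(T_\alpha^-)_{01}\,(N_\alpha+1)\,e^{i2\pi(\nu_{10}-\nu_\alpha)t}\,\frac{e^{i2\pi(\nu_\alpha-\nu_{10}+i\Gamma)t}-1}{\nu_{10}-\nu_\alpha-i\Gamma},$$

that is,

$$\Gamma = \frac{i}{h^2}\sum_\alpha\gamma_\alpha^2\,(T_\alpha^+)_{10}(T_\alpha^-)_{01}\,(N_\alpha+1)\,\frac{1-e^{i2\pi(\nu_{10}-\nu_\alpha-i\Gamma)t}}{\nu_{10}-\nu_\alpha-i\Gamma}. \tag{438 b}$$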

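Replacing here the operator $N_\alpha$ as before by its probable (average)
value $(c^3/8\pi h\nu^3)\rho_\nu$, and further replacing the summation over
$\alpha$ by an integration over $\nu$ with the expression
$Z_\nu\,d\nu = 8\pi V\nu^2\,d\nu/c^3$ for the number of $\alpha$-values in the range
$d\nu$, we get

$$\Gamma = \frac{\gamma_{\nu_0}^2}{h^2}\,\overline{(T^+)_{10}(T^-)_{01}}\;\frac{8\pi V\nu_0^2}{c^3}\Bigl(\frac{c^3}{8\pi h\nu_0^3}\,\rho_{\nu_0}+1\Bigr)J, \qquad J = i\int_{-\infty}^{+\infty}\frac{1-e^{i2\pi(\nu_{10}-\nu-i\Gamma)t}}{\nu_{10}-\nu-i\Gamma}\,d\nu,$$

where $\nu_0$ denotes the resonance frequency $\nu_{10}$, and
$\overline{(T^+)_{10}(T^-)_{01}}$ the average value of the product
$(T_\alpha^+)_{10}(T_\alpha^-)_{01}$ for all the directions of the vector k with the
fixed magnitude $k = \nu/c$. The integral J appearing in this formula is
easily seen to be independent of the value of the parameter $\Gamma$ and to
be equal to $\pi$. We have in fact, putting $\Gamma = 0$ and $\nu-\nu_{10} = \xi$,

$$J = i\int_{-\infty}^{+\infty}\frac{\cos2\pi\xi t-1}{\xi}\,d\xi+\int_{-\infty}^{+\infty}\frac{\sin2\pi\xi t}{\xi}\,d\xi.$$

The first term obviously vanishes since the integrand is an odd function
of $\xi$, while the second reduces to the well-known integral of Laplace,
which is equal to $\pi$.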

Thus, if we neglect the difference between the factors $T^+$ and $T^-$,
replacing them simply by T (which is always permissible if the resonance
wave-length, $\lambda = c/\nu_{10}$, is large compared with the effective
dimensions of the atom), and take into account the relation (435 b), we
obtain for the constant $\Gamma$ the following expression:

$$\Gamma = \frac{1}{4\pi}\,B_{10}\Bigl(\frac{8\pi h\nu^3}{c^3}+\rho_\nu\Bigr),$$

whence

$$4\pi\Gamma = A_{10}+B_{10}\,\rho_\nu. \tag{439}$$

This quantity is usually denoted as the damping exponent since the
number of atoms in the excited state $a_1^\dagger a_1$ decreases with the time as
$e^{-4\pi\Gamma t}$. Under ordinary circumstances the second term in (439) is
small compared with the first one, so that the damping constant is
numerically equal to the probability of a spontaneous transition between
the corresponding states.

In the general case when the atom is initially in an excited state r
from which spontaneous transitions are possible to several states of
lower energy s, the damping constant is equal to the sum of the
corresponding transition probabilities

$$4\pi\Gamma_r = \sum_sA_{rs} \qquad (W_s < W_r). \tag{439 a}$$

The probability amplitude of the rth state $a_r$ decreases with the time
like $e^{-2\pi\Gamma_rt}$. Multiplying this expression by
$\psi_r = \psi_r^0(x)\,e^{-i2\pi\nu_rt}$ we can treat the resulting function

$$a_r\psi_r = \psi_r^0(x)\,e^{-i2\pi(\nu_r-i\Gamma_r)t}$$

as representing damped vibrations, corresponding to a complex value of
the frequency $\nu_r-i\Gamma_r$. Such damped vibrations starting at a certain
instant t = 0 can be analysed into a series of undamped harmonic
vibrations, according to the equation

$$f(t) = e^{-i2\pi(\nu_r-i\Gamma_r)t} = \int_{-\infty}^{+\infty}A_\nu\,e^{-i2\pi\nu t}\,d\nu \qquad (t \geq 0),$$

where

$$A_\nu = \int_0^\infty f(t)\,e^{i2\pi\nu t}\,dt = \int_0^\infty e^{i2\pi(\nu-\nu_r+i\Gamma_r)t}\,dt$$

or

$$A_\nu = \Bigl[-\frac{e^{-2\pi[\Gamma_r-i(\nu-\nu_r)]t}}{2\pi[\Gamma_r-i(\nu-\nu_r)]}\Bigr]_0^\infty = \frac{1}{2\pi[\Gamma_r-i(\nu-\nu_r)]}, \qquad |A_\nu|^2 = \frac{1}{4\pi^2}\,\frac{1}{\Gamma_r^2+(\nu-\nu_r)^2}. \tag{439 b}$$


This corresponds to an effective spectral width $\Gamma_r$ of the state in
question, in agreement with the interpretation of complex energy values
given in Part I, § 15, in connexion with the problem of radioactive decay.
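
The line shape (439 b) can be verified by carrying out the Fourier
integral for $A_\nu$ numerically. The following Python sketch does so by
the trapezoidal rule; the function names, the sample values of $\nu_r$
and $\Gamma_r$, and the upper limit `t_max` (which replaces the infinite
limit, the integrand being by then entirely negligible) are assumptions
made here for illustration.

```python
import cmath
import math


def spectral_amplitude(nu, nu_r, gamma, t_max=20.0, steps=20_000):
    """Trapezoidal estimate of A_nu = integral_0^infinity f(t) e^{i 2 pi nu t} dt
    for the damped vibration f(t) = e^{-i 2 pi (nu_r - i gamma) t}."""
    dt = t_max / steps

    def integrand(t):
        return cmath.exp(2j * math.pi * (nu - nu_r + 1j * gamma) * t)

    total = 0.5 * (integrand(0.0) + integrand(t_max))
    for k in range(1, steps):
        total += integrand(k * dt)
    return total * dt


def lorentzian(nu, nu_r, gamma):
    """|A_nu|^2 according to (439 b)."""
    return 1.0 / (4.0 * math.pi**2 * (gamma**2 + (nu - nu_r)**2))


NU_R, GAMMA = 10.0, 0.5          # arbitrary sample values (assumed)
checks = [(nu, abs(spectral_amplitude(nu, NU_R, GAMMA))**2,
           lorentzian(nu, NU_R, GAMMA))
          for nu in (9.0, 9.5, 10.0, 10.5, 11.0)]
for nu, numeric, exact in checks:
    print(nu, numeric, exact)
```

The intensity falls to half its maximum value for
$|\nu-\nu_r| = \Gamma_r$, which is the sense in which $\Gamma_r$ measures
the width of the state.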
50. Application of Quantized Electron Waves to the Emission
and Scattering of Radiation 

If in the function

$$\psi(x, t) = \sum_ra_r\psi_r = \sum_ra_r\psi_r^0(x)\,e^{-i2\pi\nu_rt}, \tag{440}$$

representing the undisturbed motion of the electron, the coefficients
$a_r$ are treated not as ordinary complex numbers, but as operators
satisfying the relations

$$a_r^\dagger a_s+a_sa_r^\dagger = \delta_{rs}, \qquad a_ra_s+a_sa_r = 0, \qquad a_r^\dagger a_s^\dagger+a_s^\dagger a_r^\dagger = 0, \tag{440 a}$$

$\psi$ will represent the motion of any given number of electrons, distributed
over the individual states $\psi_r$, the number of electrons associated with
a particular state r being defined as the characteristic value of the
operator

$$a_r^\dagger a_r = n_r \tag{440 b}$$

(i.e. 1 or 0); it should be remembered that the product $a_ra_r^\dagger$ is equal
to $1-n_r$.
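
For a single state r the relations (440 a) and (440 b) can be realized
by two-rowed matrices. The following Python sketch is limited to this
case (r = s); the particular matrices and the ordering of the basis
($n_r = 0$ first, $n_r = 1$ second) are assumptions made here for
illustration, and the relations between different states, which require
additional sign factors, are not represented.

```python
def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def add(x, y):
    return [[x[i][j] + y[i][j] for j in range(2)] for i in range(2)]


# Basis ordering (n_r = 0, n_r = 1); a empties the state, a_dag fills it.
a = [[0, 1],
     [0, 0]]
a_dag = [[0, 0],
         [1, 0]]
identity = [[1, 0], [0, 1]]

n_op = matmul(a_dag, a)                                    # a† a = n_r
anticommutator = add(matmul(a_dag, a), matmul(a, a_dag))   # a† a + a a†
a_squared = matmul(a, a)                                   # a a
hole = matmul(a, a_dag)                                    # a a† = 1 - n_r

print(n_op, anticommutator, a_squared, hole)
```

The diagonal elements of `n_op` are the two characteristic values 0 and
1 of $n_r$, and the vanishing of `a_squared` expresses the impossibility
of putting two electrons into the same state.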

It has been shown by Heisenberg that with this definition of $\psi$,
corresponding to quantized electron waves, it is possible to give an adequate
description of the emission (and scattering) of radiation in terms of the
classical electromagnetic theory, if, following Schrödinger, we replace
the classical mechanical quantities (coordinates, velocities, etc.) by their 
average or probable values. 

This wave-mechanical theory of light emission has been discussed 
already in Part I, § 17, with the help of ‘classical’ (i.e. unquantized) 
electron waves as giving rise to classical electromagnetic waves. It has 
been shown there that light vibrations defined as ‘beats’ (‘difference 
tones’) between two electron waves have correct frequencies, but that 
their amplitude is proportional not only to the probability of the initial 
state but also to that of the final one—which contradicts the photon 
theory of radiation. Now r this contradiction can be removed if the 
‘classical’ electron waves are replaced by quantized ones; the resulting 
electromagnetic waves appear likewise as quantized although in a way 
somewhat different from that considered in the preceding section. 

The mechanical quantity which determines the radiation emitted by
an atom can be defined according to Schrödinger’s theory as the
probable value of the electric moment of the atom

$$\bar{P} = \int \psi^{*}P\,\psi\,dV.$$

If we are concerned with several electrons $P$ must denote the sum
$\sum_i e\mathbf{r}_i$, where $\mathbf{r}_i$ is the radius vector of the $i$th electron (with respect to
the nucleus), and $\psi$ an antisymmetrical function of the coordinates
of all the electrons, the integration being extended over the whole
configuration space. Introducing quantized electron waves, we can
represent the totality of the electrons by the three-dimensional
operator-function (440), and replace the preceding expression for the probable
value of the resulting electric moment by the operator

$$P = \int \psi^{\dagger}P\,\psi\,dV, \tag{441}$$

whose characteristic values must be considered as the probable values
of $P$. Just as for the quantized electromagnetic waves discussed in the
preceding section, we are thus concerned with probabilities of the
‘second order’, i.e. the probabilities of certain probable values of $P$, the
corresponding second-order probability amplitudes $C$ being defined by
an equation of the form $PC = P'C$. As a matter of fact, we need
not bother about these probabilities, for the quantity we are actually
interested in, and which can be directly compared with the experimental
facts, is the probable value $\bar{P}$ of the operator $P$, which, as we shall
presently see, can usually be determined directly.

It should be emphasized that the order in which the two factors
$\psi^{\dagger}$ and $\psi$ appear in the expression (441) is an essential feature of this
expression, since these factors do not commute with each other. We
should obtain wrong results if the operator $P$ were defined as $\int \psi P\psi^{\dagger}\,dV$.

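Substituting in (441) the expression (440) for $\psi$, we get

$$P(t) = \sum_r\sum_s a_r^{\dagger}a_s\,P^{0}_{rs}\,e^{i2\pi\nu_{rs}t} \tag{441 a}$$

and consequently

$$\frac{d^2P}{dt^2} = -\sum_{r\neq s} a_r^{\dagger}a_s\,P^{0}_{rs}\,(2\pi\nu_{rs})^2\,e^{i2\pi\nu_{rs}t}. \tag{441 b}$$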

This expression can be considered as defining in the same way as in
the classical electromagnetic theory the electric and magnetic field
generated by the atom at sufficiently remote points.

The electrical intensity in a given direction $\tau$, say, at a distance $R$
from this atom (the unit vector $\tau$ being perpendicular to $R$) is thus
represented by the operator

$$E_\tau(R,t) = -\frac{1}{c^2R}\,\frac{d^2P_\tau}{dt^2}\Bigl(t-\frac{R}{c}\Bigr),$$

$c$ being the velocity of light, that is,

$$E_\tau = E^{-}+E^{+} = \frac{1}{c^2R}\sum_{r\neq s} a_r^{\dagger}a_s\,(P^{0}_{rs})_\tau\,(2\pi\nu_{rs})^2\,e^{i2\pi\nu_{rs}(t-R/c)}, \tag{442}$$

where $E^{-}$ corresponds to terms with negative frequencies and $E^{+}$ to
those with positive frequencies.

The electric field defined by (442) is an operator, of a type somewhat
similar to that defined in the preceding section with the help of the
operators $b$, $b^{\dagger}$, the operators $a_r^{\dagger}a_s$ corresponding to $b$ if $\nu_{rs} < 0$
($W_r < W_s$) and to $b^{\dagger}$ if $\nu_{rs} > 0$ ($W_r > W_s$). The connexion between the
two types of operators will be examined later on. We are concerned
here only with the fact that in order to obtain the observed electric
field we must take the characteristic or probable values of (442). In
the absence of definite phase relations between the operators $a_r$ and $a_s$
referring to different states, i.e. when the different harmonic terms in
(440) are incoherent with regard to each other, the probable values
of $a_r^{\dagger}a_s$ are equal to zero so long as $r \neq s$, so that the probable value
of (442) vanishes. This is practically equivalent to the fact that
the average value of $E_\tau$ with respect to the time is equal to zero. The
quantity we are interested in is, however, not the electric field-strength
but the corresponding energy. According to the classical theory, the
latter (or more exactly the energy-density) is proportional to the square
of $E_\tau$. In order to obtain the operator which serves to define the energy
in agreement with the photon theory of radiation we must, instead of
squaring $E_\tau$, multiply $E^{+}$ by $E^{-}$ in the order stated (just as in the
preceding section where $E^{+}$ was replaced by $\phi^{\dagger}$ and $E^{-}$ by $\phi$). This
gives

$$E^{+}E^{-} = \frac{1}{c^4R^2}\sum_{r>s}\sum_{r'>s'} a_r^{\dagger}a_s\,a_{s'}^{\dagger}a_{r'}\,(2\pi\nu_{rs})^2(2\pi\nu_{s'r'})^2\,P^{0}_{rs}P^{0}_{s'r'}\,e^{i2\pi(\nu_{rs}+\nu_{s'r'})(t-R/c)}, \tag{442 a}$$

it being understood that $\nu_{rs} > 0$ if $r > s$ (the index $\tau$ is dropped for
convenience).

We shall take in the first place the time average of this expression,
which can be done by keeping those terms only for which $\nu_{rs}+\nu_{s'r'} = 0$,
that is, $r' = r$ and $s' = s$. We thus get

$$\overline{E^{+}E^{-}} = \frac{1}{c^4R^2}\sum_{r>s} a_r^{\dagger}a_s a_s^{\dagger}a_r\,|P^{0}_{rs}|^2\,(2\pi\nu_{rs})^4.$$

It should be mentioned that the same result is obtained by averaging
over the phases of the operators $a_r$, etc., if they are assumed to correspond
to incoherent vibrations.

Now

$$a_r^{\dagger}a_s a_s^{\dagger}a_r = -a_r^{\dagger}a_s a_r a_s^{\dagger} = a_r^{\dagger}a_r a_s a_s^{\dagger} = n_r(1-n_s),$$

so that

$$\overline{E^{+}E^{-}} = \frac{1}{c^4R^2}\sum_{r>s} n_r(1-n_s)\,|P^{0}_{rs}|^2\,(2\pi\nu_{rs})^4. \tag{442 b}$$

This formula shows that the intensity of the emitted light is equal to
the sum of terms corresponding to a combination of two states $(r, s)$,
provided the upper state is occupied ($n_r = 1$) and the lower vacant ($n_s = 0$).
This result is in complete harmony with what should be expected on
the photon theory of light emission in connexion with Pauli’s exclusion
principle. The formula (442 b) can thus be regarded as the improved
version of the ‘classical’ wave-mechanical equation (92) of Part I, § 17,
where the upper and lower states appeared in a quite symmetrical
manner. Indeed, we come back to this result if we consider the amplitudes
$a_r$, $a_s$ as ordinary numbers and not as operators.
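
The statement that the product $a_r^{\dagger}a_s a_s^{\dagger}a_r$ has the characteristic values $n_r(1-n_s)$ can be verified by letting the operators act on states with definite occupation numbers. The following is a minimal sketch; the sign convention $(-1)^{n_0+\dots+n_{r-1}}$ and the function names are assumptions made for the illustration and are not prescribed by the text.

```python
from itertools import product


def annihilate(r, coeff, occ):
    """Apply a_r to coeff*|occ>; the result vanishes if state r is empty."""
    if occ[r] == 0:
        return 0, occ
    sign = (-1) ** sum(occ[:r])
    return coeff * sign, occ[:r] + (0,) + occ[r + 1:]


def create(r, coeff, occ):
    """Apply a_r^dagger to coeff*|occ>; the result vanishes if state r is occupied."""
    if occ[r] == 1:
        return 0, occ
    sign = (-1) ** sum(occ[:r])
    return coeff * sign, occ[:r] + (1,) + occ[r + 1:]


def emission_weight(r, s, occ):
    """Characteristic value of a_r^dagger a_s a_s^dagger a_r for the occupation tuple occ."""
    coeff, state = annihilate(r, 1, occ)      # the rightmost factor acts first
    coeff, state = create(s, coeff, state)
    coeff, state = annihilate(s, coeff, state)
    coeff, state = create(r, coeff, state)
    assert coeff == 0 or state == occ         # the product is diagonal
    return coeff
```

For three states, running over `product((0, 1), repeat=3)` and all pairs $r \neq s$ gives `emission_weight(r, s, occ) == occ[r] * (1 - occ[s])` in every case: a term contributes only if the state $r$ is occupied and the state $s$ vacant.
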

If in the expression (441 a) $a_r$ and $a_s$ are multiplied by the damping
factors $e^{-2\pi\Gamma_r t}$ and $e^{-2\pi\Gamma_s t}$, the light vibrations with the frequency
$\nu_{rs} = (W_r - W_s)/h$ due to the combination of the corresponding states
appear as damped with the damping constant

$$\Gamma_{rs} = \Gamma_r+\Gamma_s = \frac{1}{4\pi}\Bigl(\sum_{p<r} A_{rp} + \sum_{q<s} A_{sq}\Bigr).$$

The effective width of the spectral line emitted in a transition from
one state to another is thus equal to the sum of the widths of both the
initial and final states.

We shall now investigate, with the help of the formula (442), the light
emitted by an atom under the perturbing influence of ‘primary’ electromagnetic
waves, or, in other words, the phenomenon of the scattering
of radiation. As has been shown in Chap. V, § 23, the interpretation
of this phenomenon from the purely mechanical point of view necessitates
the consideration of double transitions, which correspond to the
second approximation in the solution of the perturbation problem. If,
however, we consider the radiation emitted (scattered) by the perturbed
atom, we can confine ourselves to the first approximation, which in
conjunction with equation (442) gives equivalent results.

Let us for a moment treat the coefficients $a_r$ as ordinary numbers,
and define the electric field of the primary light waves by the expression

$$E^{0} = \tfrac{1}{2}\,(b\,e^{-i2\pi\nu t} + b^{\dagger}e^{i2\pi\nu t}),$$

where $b$ is the (complex) amplitude of $E^{0}$. Let us assume further that
at the initial moment, $t = 0$, the coefficients $a_q$, $a_{q'}$, $a_{q''}$,... are different
from 0, while all the other coefficients $a_r$, $a_s$,... vanish. We then get
from (435), with T' F replaced by $P_\sigma$, the component of $P$ in the direction
$\sigma$ of the vector $E^{0}$, and the summation over a by a summation over $q$:


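$$a_r = \Delta_1 a_r = \frac{1}{2h}\sum_q a_q^{0}\,(P^{0}_\sigma)_{rq}\left[b\,\frac{e^{-i2\pi(\nu-\nu_{rq})t}-1}{\nu_{rq}-\nu} + b^{\dagger}\,\frac{e^{i2\pi(\nu+\nu_{rq})t}-1}{\nu_{rq}+\nu}\right].$$

Substituting this expression and the similar expression for the conjugate
complex

$$a_r^{\dagger} = \Delta_1 a_r^{\dagger} = \frac{1}{2h}\sum_q a_q^{0\dagger}\,(P^{0}_\sigma)_{qr}\left[b^{\dagger}\,\frac{e^{i2\pi(\nu-\nu_{rq})t}-1}{\nu_{rq}-\nu} + b\,\frac{e^{-i2\pi(\nu+\nu_{rq})t}-1}{\nu_{rq}+\nu}\right]$$

in the formula (441 a) for the electric moment or its projection in a given
direction $\tau$ and dropping small terms of the second order, we have

$$P_\tau(t) = \sum_r\sum_q\left[\Delta_1 a_r^{\dagger}\,a_q^{0}\,(P^{0}_\tau)_{rq}\,e^{i2\pi\nu_{rq}t} + a_q^{0\dagger}\,\Delta_1 a_r\,(P^{0}_\tau)_{qr}\,e^{i2\pi\nu_{qr}t}\right].$$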

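If furthermore we drop irrelevant terms which do not contain the
primary frequency (they can actually be considered as fading away
owing to the damping), we get, with the help of the relations

$$\nu_{rq}+\nu_{q'r} = \nu_{q'q} = -\nu_{qq'},$$

$$P_\tau(t) = \frac{1}{2h}\sum_q\sum_{q'}\sum_r\Biggl\{a_q^{0\dagger}a_{q'}^{0}\,(P^{0}_\sigma)_{qr}(P^{0}_\tau)_{rq'}\left[b^{\dagger}\frac{e^{i2\pi(\nu-\nu_{q'q})t}}{\nu_{rq}-\nu} + b\,\frac{e^{-i2\pi(\nu+\nu_{q'q})t}}{\nu_{rq}+\nu}\right] + a_{q'}^{0\dagger}a_q^{0}\,(P^{0}_\tau)_{q'r}(P^{0}_\sigma)_{rq}\left[b\,\frac{e^{-i2\pi(\nu-\nu_{q'q})t}}{\nu_{rq}-\nu} + b^{\dagger}\frac{e^{i2\pi(\nu+\nu_{q'q})t}}{\nu_{rq}+\nu}\right]\Biggr\},$$

or, rearranging the different terms,

$$P_\tau(t) = \sum_q\sum_{q'}\left[a_{q'}^{0\dagger}a_q^{0}\,u^{-}_{q'q}\,b\,e^{-i2\pi(\nu-\nu_{q'q})t} + a_q^{0\dagger}a_{q'}^{0}\,u^{-\dagger}_{qq'}\,b^{\dagger}e^{i2\pi(\nu-\nu_{q'q})t}\right] + \sum_q\sum_{q'}\left[a_q^{0\dagger}a_{q'}^{0}\,u^{+}_{qq'}\,b\,e^{-i2\pi(\nu+\nu_{q'q})t} + a_{q'}^{0\dagger}a_q^{0}\,u^{+\dagger}_{q'q}\,b^{\dagger}e^{i2\pi(\nu+\nu_{q'q})t}\right]. \tag{443}$$

In this formula

$$u^{-}_{q'q} = \frac{1}{2h}\sum_r\frac{(P^{0}_\tau)_{q'r}(P^{0}_\sigma)_{rq}}{\nu_{rq}-\nu}, \qquad u^{+}_{qq'} = \frac{1}{2h}\sum_r\frac{(P^{0}_\sigma)_{qr}(P^{0}_\tau)_{rq'}}{\nu_{rq}+\nu},$$

$$u^{-\dagger}_{qq'} = \frac{1}{2h}\sum_r\frac{(P^{0}_\sigma)_{qr}(P^{0}_\tau)_{rq'}}{\nu_{rq}-\nu}, \qquad u^{+\dagger}_{q'q} = \frac{1}{2h}\sum_r\frac{(P^{0}_\tau)_{q'r}(P^{0}_\sigma)_{rq}}{\nu_{rq}+\nu}. \tag{443 a}$$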


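The electric field strength of the scattered radiation at a distance $R$,
$E_\tau(t) = -\dfrac{1}{c^2R}\dfrac{d^2P_\tau}{dt^2}$, is thus given by the formula

$$E_\tau(t) = \frac{1}{c^2R}\Bigl\{\sum_q\sum_{q'}[2\pi(\nu-\nu_{q'q})]^2\left[a_{q'}^{0\dagger}a_q^{0}\,u^{-}_{q'q}\,b\,e^{-i2\pi(\nu-\nu_{q'q})(t-R/c)} + a_q^{0\dagger}a_{q'}^{0}\,u^{-\dagger}_{qq'}\,b^{\dagger}e^{i2\pi(\nu-\nu_{q'q})(t-R/c)}\right] + \sum_q\sum_{q'}[2\pi(\nu+\nu_{q'q})]^2\left[a_q^{0\dagger}a_{q'}^{0}\,u^{+}_{qq'}\,b\,e^{-i2\pi(\nu+\nu_{q'q})(t-R/c)} + a_{q'}^{0\dagger}a_q^{0}\,u^{+\dagger}_{q'q}\,b^{\dagger}e^{i2\pi(\nu+\nu_{q'q})(t-R/c)}\right]\Bigr\}. \tag{443 b}$$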

Although in the preceding calculation the coefficients $a_r$, etc., were
dealt with as ordinary numbers, the results obtained remain valid if
we regard them as operators, since in writing down the products $a_{q'}^{0\dagger}a_q^{0}$,
etc., we have always preserved the correct order of the factors. The
smallness of the first-order coefficients $\Delta_1 a_r$ must be understood to mean
in this case the smallness of the average (or probable) values of the
corresponding probabilities $|a_r|^2$ (i.e. the predominance of the characteristic
values $|a_r|^2 = 0$).

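Let us consider separately the special case when the atom is supposed
to be initially in a definite state $q$. The double sum (443) reduces in
this case to the single term

$$P_\tau(t) = a_q^{0\dagger}a_q^{0}\,w_{qq}\,(b\,e^{-i2\pi\nu t} + b^{\dagger}e^{i2\pi\nu t}), \tag{444}$$

where

$$w_{qq} = \frac{1}{h}\sum_r\frac{\nu_{rq}\,(P^{0}_\sigma)_{qr}(P^{0}_\tau)_{rq}}{\nu_{rq}^2-\nu^2}, \tag{444 a}$$

and the electric field strength (443 b) to

$$E_\tau(t) = \frac{(2\pi\nu)^2}{c^2R}\,a_q^{0\dagger}a_q^{0}\,w_{qq}\,\bigl(b\,e^{-i2\pi\nu(t-R/c)} + b^{\dagger}e^{i2\pi\nu(t-R/c)}\bigr). \tag{444 b}$$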


The scattered radiation has in this case the same frequency as the
primary one. This is the so-called simple or Rayleigh scattering. In
the general case of equation (443 a) we obtain in addition to this simple
scattering a ‘combination’ or Raman scattering with a number of
modified frequencies $\nu \pm \nu_{q'q}$.

In order to obtain the average energy of the scattered rays we must
take the square of $E_\tau$ or, more exactly, the product of $E^{+}$ with $E^{-}$.
For Rayleigh scattering this gives

$$E^{+}E^{-} \sim w_{qq}^2\,a_q^{0\dagger}a_q^{0}a_q^{0\dagger}a_q^{0}\,b^{\dagger}b,$$

that is,

$$E^{+}E^{-} \sim w_{qq}^2\,(n_q^{0})^2\,b^{\dagger}b = w_{qq}^2\,n_q^{0}\,b^{\dagger}b, \tag{445}$$

since the characteristic values of $(n_q^{0})^2$ and $n_q^{0}$ are the same (1 or 0).

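For the Raman scattering the situation is somewhat more complicated.
We shall consider separately the scattered rays with the
frequency $\nu-\nu_{q'q}$ and those with the frequency $\nu+\nu_{q'q}$.

Taking the time average of $E^{+}E^{-}$, according to (443 b) we get

$$J_{\nu-\nu_{q'q}} \sim (\nu-\nu_{q'q})^4\,u^{-\dagger}_{qq'}u^{-}_{q'q}\,a_q^{0\dagger}a_{q'}^{0}a_{q'}^{0\dagger}a_q^{0}\,b^{\dagger}b,$$

that is,

$$J_{\nu-\nu_{q'q}} \sim (\nu-\nu_{q'q})^4\,u^{-\dagger}_{qq'}u^{-}_{q'q}\,n_q^{0}(1-n_{q'}^{0})\,b^{\dagger}b; \tag{445 a}$$

and in a similar way

$$J_{\nu+\nu_{q'q}} \sim (\nu+\nu_{q'q})^4\,u^{+\dagger}_{q'q}u^{+}_{qq'}\,a_{q'}^{0\dagger}a_q^{0}a_q^{0\dagger}a_{q'}^{0}\,b^{\dagger}b,$$

or

$$J_{\nu+\nu_{q'q}} \sim (\nu+\nu_{q'q})^4\,u^{+\dagger}_{q'q}u^{+}_{qq'}\,n_{q'}^{0}(1-n_q^{0})\,b^{\dagger}b. \tag{445 b}$$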

These results are in harmony with the experimental data and with the
elementary theory (due to Smekal) of the Raman effect, based on the
idea of photons. In order to secure complete agreement we must make,
however, the additional assumption that $\nu_{q'q}$ is positive, i.e. that
$W_{q'} > W_q$.

The scattered photon with the decreased frequency $\nu-\nu_{q'q}$ is obtained
on this view if the atom was initially in the lower state ($n_q^{0} = 1$), the
higher state $q'$ being vacant ($n_{q'}^{0} = 0$). In the contrary case ($n_{q'}^{0} = 1$,
$n_q^{0} = 0$) the atom jumps from the higher state to the lower one, adding
the energy $h\nu_{q'q}$ to that of the incident photon, which results in the emission
of the scattered photon with the increased frequency $\nu+\nu_{q'q}$.

It should be mentioned that the intermediate states $r$, which determine
the intensity of the scattered radiation through the factors $u^{\pm}$,
in contradistinction to the final state $q$ or $q'$, need not be vacant,
since the corresponding numbers (operators) $n_r$ do not appear in the
equations (444 a) and (444 b). This can be explained by the fact that
if some intermediate state $r$ is occupied, the electron starting from the
state $q$, say, is interchanged with the electron in the state $r$, which
passes to the final state $q'$.

The probability amplitude of such double transitions $q \to r \to q'$ with
interchange must be the same as for double transitions without interchange,
since the electrons are indistinguishable.

The expressions

$$(\nu-\nu_{q'q})^4\,u^{-\dagger}_{qq'}u^{-}_{q'q} \quad\text{and}\quad (\nu+\nu_{q'q})^4\,u^{+\dagger}_{q'q}u^{+}_{qq'},$$

which are a measure of the intensity of the scattered radiation with
the frequency $\nu \mp \nu_{q'q}$, are in agreement with the expressions (184)
derived in § 23 for the probability of the double transitions which are
responsible for the scattering.

The preceding theory of the scattering process can be improved by
taking account of the damping, which is described by adding to the
frequency $\nu_r$ of each state the imaginary term $-i\Gamma_r$ considered above.
This correction becomes especially important in the neighbourhood of
resonance. We thus get, for example, instead of (444 a),

$$w_{qq} = \frac{1}{h}\sum_r\frac{\nu_{rq}\,(P^{0}_\sigma)_{qr}(P^{0}_\tau)_{rq}}{\nu_{rq}^2-\nu^2+i\nu\Gamma_{qr}},$$

where $\Gamma_{qr} = \Gamma_q+\Gamma_r$ is the damping factor for the line emitted in the
transition between the states $q$ and $r$. This expression remains finite
when $\nu = \nu_{rq}$, determining the polarization and intensity of the so-called
‘resonance radiation’.
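
The effect of the damping term near resonance can be illustrated numerically. The following is a minimal sketch, assuming a coefficient of the form $w_{qq} = \frac{1}{h}\sum_r \nu_{rq}(P_\sigma)_{qr}(P_\tau)_{rq}/(\nu_{rq}^2-\nu^2+i\nu\Gamma_{qr})$, with matrix elements supplied as plain numbers; the function name, the argument layout and the numerical values are hypothetical.

```python
def rayleigh_coefficient(nu, transitions, h=1.0):
    """Sum over intermediate states r of
    nu_rq * P_sigma * P_tau / (nu_rq**2 - nu**2 + i*nu*Gamma_qr), divided by h.

    transitions: list of (nu_rq, p_sigma, p_tau, gamma_qr) tuples, one per state r.
    Setting gamma_qr = 0 gives the undamped expression.
    """
    total = 0j
    for nu_rq, p_sigma, p_tau, gamma_qr in transitions:
        denominator = complex(nu_rq ** 2 - nu ** 2, nu * gamma_qr)
        total += nu_rq * p_sigma * p_tau / denominator
    return total / h
```

With a single intermediate state, `rayleigh_coefficient(2.0, [(2.0, 1.0, 1.0, 0.1)])` has the finite modulus $1/\Gamma_{qr} = 10$ at exact resonance, whereas the same call with `gamma_qr = 0.0` fails with a division by zero. Far from resonance the two expressions practically coincide.
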

The radiation theory sketched above is inexact in the sense that it does
not take into account adequately the retarded character of the electromagnetic
actions. This has been done approximately by substituting
the difference $t-R/c$ for $t$, where $R$ is the distance of some point (centre,
say) of the atom from the point in question. This approximation does
not hold, however, if the wave-length of the emitted or scattered light
$\lambda = c/\nu$ is of the same order of magnitude as or smaller than the linear
extension of the atom. The electromagnetic field generated by the
latter can be determined in this case by the classical expressions for
the scalar and the vector potential

$$\phi(\mathbf{r},t) = e\int\frac{\rho(\mathbf{r}',t-R/c)}{R}\,dV', \qquad \mathbf{A}(\mathbf{r},t) = \frac{e}{c}\int\frac{\mathbf{j}(\mathbf{r}',t-R/c)}{R}\,dV', \tag{446}$$

where $R = |\mathbf{r}-\mathbf{r}'|$ is the distance of the point considered from some
point $\mathbf{r}'$ in the volume-element $dV'$ of the electron-cloud. Here $e$ denotes
the charge of the electron, while

$$\rho = \psi^{\dagger}\psi \tag{446 a}$$

is the density of the cloud and $\mathbf{j}$ the corresponding current density.‡
According to Schrödinger’s theory, the latter is given by

$$\mathbf{j} = \frac{1}{2m}\bigl[\psi^{\dagger}\,\mathbf{u}\psi + (\mathbf{u}\psi)^{\dagger}\psi\bigr], \tag{446 b}$$

where $\mathbf{u} = \dfrac{h}{2\pi i}\nabla - \dfrac{e}{c}\mathbf{A}$ is the operator of proper momentum, whereas
according to Dirac’s theory

$$\mathbf{j} = c\,\psi^{\dagger}\boldsymbol{\gamma}\,\psi, \tag{446 c}$$

$c\boldsymbol{\gamma}$ being the velocity matrix and $\psi$ the operator corresponding to
Dirac’s wave function. Substituting for the latter the expression (440),
where $x$ is an abbreviation for the geometrical coordinates and the
spin-coordinate, we get

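$$\rho = \sum_r\sum_s a_r^{\dagger}a_s\,\psi_r^{0\dagger}(x)\psi_s^{0}(x)\,e^{i2\pi\nu_{rs}t}, \qquad \mathbf{j} = c\sum_r\sum_s a_r^{\dagger}a_s\,\psi_r^{0\dagger}(x)\boldsymbol{\gamma}\psi_s^{0}(x)\,e^{i2\pi\nu_{rs}t}.$$

Before substituting these expressions in (446) we must replace $x$ by $x'$
(coordinates of the point $\mathbf{r}'$) and $t$ by $t' = t-R/c$. Now so long as $R$ is
very large compared with the atomic dimensions we can put

$$R = R_0 - \mathbf{n}\cdot\mathbf{r}',$$

where $R_0$ is the distance of the point $\mathbf{r}$ from the centre (nucleus) of the
atom and $\mathbf{n} = \mathbf{R}_0/R_0$ the unit vector pointing in the corresponding
direction. We thus have

$$\rho(\mathbf{r}',t') = \sum_r\sum_s a_r^{\dagger}a_s\,\psi_r^{0\dagger}(x')\psi_s^{0}(x')\,e^{i2\pi\nu_{rs}(t-R_0/c+\mathbf{n}\cdot\mathbf{r}'/c)}$$

and a similar expression for $\mathbf{j}(\mathbf{r}',t')$.

‡ More exactly, the operators whose characteristic values are the probable values of
the respective densities.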

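Replacing $R$ in the denominator of the integrands in (446) by $R_0$,
which is permissible so long as $R_0$ is supposed to be sufficiently large,
we obtain the following expressions for the electromagnetic potentials,

$$\phi(\mathbf{r},t) = \frac{e}{R_0}\sum_r\sum_s a_r^{\dagger}a_s\,e^{i2\pi\nu_{rs}(t-R_0/c)}f_{rs}, \qquad \mathbf{A}(\mathbf{r},t) = \frac{e}{R_0}\sum_r\sum_s a_r^{\dagger}a_s\,e^{i2\pi\nu_{rs}(t-R_0/c)}\mathbf{g}_{rs}, \tag{447}$$

where

$$f_{rs} = \int\psi_r^{0\dagger}\psi_s^{0}\,e^{i2\pi\nu_{rs}\mathbf{n}\cdot\mathbf{r}'/c}\,dV', \qquad \mathbf{g}_{rs} = \int\psi_r^{0\dagger}\boldsymbol{\gamma}\psi_s^{0}\,e^{i2\pi\nu_{rs}\mathbf{n}\cdot\mathbf{r}'/c}\,dV'. \tag{447 a}$$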

The electric and magnetic field strengths can be calculated from (447) with
the help of the classical equations

$$\mathbf{E} = -\nabla\phi - \frac{1}{c}\frac{\partial\mathbf{A}}{\partial t}, \qquad \mathbf{H} = \operatorname{curl}\mathbf{A}, \tag{448}$$

which give (if $R_0$ in the denominator of (447) is treated as a constant)

$$\mathbf{E} = \frac{ie}{cR_0}\sum_r\sum_s a_r^{\dagger}a_s\,e^{i2\pi\nu_{rs}(t-R_0/c)}\,2\pi\nu_{rs}\,(\mathbf{n}f_{rs}-\mathbf{g}_{rs}),$$

$$\mathbf{H} = -\frac{ie}{cR_0}\sum_r\sum_s a_r^{\dagger}a_s\,e^{i2\pi\nu_{rs}(t-R_0/c)}\,2\pi\nu_{rs}\,(\mathbf{n}\times\mathbf{g}_{rs}). \tag{448 a}$$

These expressions are easily seen to satisfy the relations

$$\mathbf{H} = \mathbf{n}\times\mathbf{E}, \qquad \mathbf{E} = -\mathbf{n}\times\mathbf{H},$$

characteristic of the classical radiation field. Indeed the only non-classical
feature of the preceding equations besides the quantum frequencies
$\nu_{rs}$ (which appear just as well in the old Schrödinger theory)
is the non-commutative character of the coefficients $a_r$. This feature
becomes manifest, however, only when we pass to the calculation of
the electromagnetic energy.
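
The relations $\mathbf{H} = \mathbf{n}\times\mathbf{E}$ and $\mathbf{E} = -\mathbf{n}\times\mathbf{H}$ can be checked for a single harmonic term, whose field amplitudes are proportional to $\mathbf{n}f-\mathbf{g}$ and $-\mathbf{n}\times\mathbf{g}$ respectively. The following is a minimal sketch with real vectors; it assumes the relation $f = \mathbf{n}\cdot\mathbf{g}$ (which is what the continuity of charge requires of the far field, and is stated here as an assumption rather than taken from the text), and the function names are hypothetical.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def far_field_amplitudes(n, g):
    """Return (E, H) up to their common factor: E = n*f - g, H = -(n x g).

    Assumes f = n.g; n must be a unit vector.
    """
    f = dot(n, g)
    e_field = tuple(n_i * f - g_i for n_i, g_i in zip(n, g))
    h_field = tuple(-component for component in cross(n, g))
    return e_field, h_field
```

For `n = (0, 0, 1)` and `g = (1, 2, 3)` this gives `E = (-1, -2, 0)` and `H = (2, -1, 0)`: both are perpendicular to `n`, and `cross(n, E)` reproduces `H`.
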


51. Connexion between Quantized Mechanical (Electron) Waves 
and Electromagnetic Waves 

As we have already pointed out, in order to obtain a correct expression
for the energy (as well as for the other quadratic quantities) we
must split up the linear parameters of the electromagnetic field $\phi$, $\mathbf{A}$,
$\mathbf{E}$, $\mathbf{H}$ into two parts: $\phi^{-}$, $\mathbf{A}^{-}$, $\mathbf{E}^{-}$, $\mathbf{H}^{-}$ and $\phi^{+}$, $\mathbf{A}^{+}$, $\mathbf{E}^{+}$, $\mathbf{H}^{+}$, corresponding
to terms with negative and positive frequencies respectively. The
energy density is then represented by the operator

$$\eta = \frac{1}{8\pi}\,(\mathbf{E}^{+}\cdot\mathbf{E}^{-} + \mathbf{H}^{+}\cdot\mathbf{H}^{-}). \tag{449}$$

In a similar way the energy stream (Poynting’s vector) is represented
by the operator

$$\mathbf{K} = \frac{c}{8\pi}\,(\mathbf{E}^{+}\times\mathbf{H}^{-} - \mathbf{H}^{+}\times\mathbf{E}^{-}). \tag{449 a}$$

The negative and positive frequency terms of $\phi$, etc., should not be
identified with the operators $\phi$ and $\phi^{\dagger}$ which have been introduced in
§ 49 with the help of the operators $b$, $b^{\dagger}$ of the Einstein-Bose statistics.
In fact the electromagnetic waves we are now considering are not plane
waves but spreading spherical waves, with amplitudes which vary as
the reciprocal distance from the emitting atom and decrease exponentially
with the time, the vibration $(r, s)$ being in fact damped
according to the law $e^{-2\pi(\Gamma_r+\Gamma_s)t}$. These damped spherical waves are,
moreover, quantized in a way different from the plane waves considered
before, namely, through the operators $a_r^{\dagger}a_s$ and $a_s^{\dagger}a_r$ instead of the
operators $b^{\dagger}$ and $b$ of the previous theory.

It is interesting, however, to note that the operators of these two
types are to some extent very similar. If $r > s$ (i.e. $W_r > W_s$), then
$a_r^{\dagger}a_s$ obviously corresponds to $b^{\dagger}$ and $a_s^{\dagger}a_r$ to $b$ (in the sense that the
former relate to harmonic terms with positive frequencies and the
latter to terms with negative frequencies). Putting accordingly

$$a_r^{\dagger}a_s = b^{+}_{rs} \quad\text{and}\quad a_s^{\dagger}a_r = b^{-}_{rs},$$

we get

$$b^{-}_{rs}b^{+}_{rs} = a_s^{\dagger}a_r a_r^{\dagger}a_s = a_s^{\dagger}a_s a_r a_r^{\dagger} = n_s(1-n_r),$$

$$b^{+}_{rs}b^{-}_{rs} = a_r^{\dagger}a_s a_s^{\dagger}a_r = a_r^{\dagger}a_r a_s a_s^{\dagger} = n_r(1-n_s),$$

and consequently

$$b^{-}_{rs}b^{+}_{rs} - b^{+}_{rs}b^{-}_{rs} = n_s-n_r.$$

In the case of an emission due to the transition $r \to s$ the characteristic
values of $n_r$ and $n_s$ after the transition are $n_r = 0$ and $n_s = 1$, so that
the preceding expression reduces to 1, just like $b\,b^{\dagger}-b^{\dagger}b$. In a similar
way it can be shown that the operators $a_r^{\dagger}a_s = b^{+}_{rs}$ and $a_{s'}^{\dagger}a_{r'} = b^{-}_{r's'}$
commute with each other unless $r' = r$ or $s' = s$ (if $r' = r$, then
$b^{-}_{rs'}b^{+}_{rs} - b^{+}_{rs}b^{-}_{rs'} = a_{s'}^{\dagger}a_s$), while $b^{+}_{r's'}$ always commutes with $b^{+}_{rs}$, and $b^{-}_{r's'}$
with $b^{-}_{rs}$.
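
The commutation properties of the quadratic combinations $b^{+}_{rs} = a_r^{\dagger}a_s$ and $b^{-}_{rs} = a_s^{\dagger}a_r$ can be verified by letting products of the operators $a$, $a^{\dagger}$ act on states with definite occupation numbers. The following is a minimal sketch for three states; the representation of a state as a dictionary, the sign convention $(-1)^{n_0+\dots+n_{r-1}}$ and all the names are assumptions made for the illustration.

```python
from itertools import product


def apply_op(kind, r, vec):
    """Apply a_r (kind 'a') or a_r^dagger (kind 'c') to vec = {occupation tuple: coefficient}."""
    out = {}
    for occ, coeff in vec.items():
        if (kind == "a") != (occ[r] == 1):
            continue  # a_r annihilates empty states, a_r^dagger annihilates occupied ones
        sign = (-1) ** sum(occ[:r])
        new = occ[:r] + (1 - occ[r],) + occ[r + 1:]
        out[new] = out.get(new, 0) + sign * coeff
    return out


def apply_word(word, vec):
    """Apply a product of operators written left to right (the rightmost acts first)."""
    for kind, r in reversed(word):
        vec = apply_op(kind, r, vec)
    return vec


def subtract(u, v):
    out = dict(u)
    for key, coeff in v.items():
        out[key] = out.get(key, 0) - coeff
    return {key: coeff for key, coeff in out.items() if coeff != 0}


def commutator(word1, word2, vec):
    """(word1 word2 - word2 word1) applied to vec."""
    return subtract(apply_word(word1 + word2, vec), apply_word(word2 + word1, vec))


def b_plus(r, s):
    return [("c", r), ("a", s)]   # a_r^dagger a_s


def b_minus(r, s):
    return [("c", s), ("a", r)]   # a_s^dagger a_r
```

With $r = 2$, $s = 0$, the call `commutator(b_minus(2, 0), b_plus(2, 0), {occ: 1})` returns the state `occ` multiplied by $n_s - n_r$; and with $s' = 1$, `commutator(b_minus(2, 1), b_plus(2, 0), {occ: 1})` coincides with the result of applying $a_{s'}^{\dagger}a_s$, i.e. `apply_word([("c", 1), ("a", 0)], {occ: 1})`.
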

These results seem to indicate that it is neither necessary nor possible
to build up a theory of quantized electromagnetic waves in empty space
on the basis of the very restricted analogy between these waves and
the quantized waves representing the motion of ordinary particles which
conform to the statistics of Einstein-Bose. The true relationship between
the electromagnetic waves and the quantized electron waves in
three-dimensional space is probably much more adequately represented
by the fact that the amplitudes of the former are quadratic in the
amplitudes of the latter, the ‘symmetrical’ operators $b$ being thus replaced
by quadratic combinations of the ‘antisymmetrical’ operators $a$.

The theory of quantized electromagnetic waves developed in § 49
must therefore be regarded as a convenient though artificial method
for dealing with radiation problems involving ‘spontaneous’ transitions,
rather than the true picture of a physical reality. As a matter
of fact, this method implies that the radiation emitted by an atom
which is situated in a rectangular enclosure with reflecting walls
is converted into plane standing waves, which represent the normal
modes of electromagnetic vibrations consistent with the corresponding
boundary conditions. Under such circumstances it is not necessary
to consider the damped spherical electromagnetic waves which
are emitted during the transition of the atom from one state to
another, this transition along with the resulting change in the radiation
field being described as a transition of the complete system, atom +
radiation (in the form of normal vibrations), from one stationary state
to another. It should be noted that this is exactly the same type of
description as that used in the perturbation theory of ordinary transitions
not involving any radiation effects: the transition is not investigated
as a process with a definite course in time, it being simply assumed
that this process brings the system from one unperturbed state to
another.

If we wished to consider the ‘spontaneous’ transition of the atom
from a higher to a lower state as the result of its own radiation field,
described by spherical waves, we should use a more complicated perturbation
method, involving damped vibrations, the transition appearing
not as an instantaneous jump with a certain probability per unit
time, but as a continuous process starting at $t = 0$ and ending at $t = \infty$,
with an effective duration of the order of $1/A$.

It should be mentioned further that from this point of view (which
seems to be the really correct one) the electromagnetic radiation ought
to be considered always in conjunction with the matter by which it
is emitted, absorbed, or scattered. In fact the radiation enclosed in
an empty vessel with perfectly reflecting walls and considered as an
independent dynamical system is merely a fiction, since its reflection
by the walls is actually due to the absorption and re-emission, or to
the scattering, by the atoms constituting these walls. The absorption
of radiation which, according to the method of quantized electromagnetic
waves in an enclosure, is simply a transition of the absorbing
atom from a state of lower energy to that of a higher energy with the
accompanying decrease of the energy of the corresponding electromagnetic
wave system by just one quantum, must be considered as the
result of the superposition on the primary radiation, causing the transition,
of the secondary radiation emitted by the atom. This is the
picture of the absorption process which is given by classical electrodynamics,
and it must remain fundamentally unchanged in a consistent
quantum theory, where actual processes must only be replaced by
probable ones.

The current idea that the emission of radiation can be due only to
a transition of the atom from a higher to a lower state is fundamentally
wrong; the converse transition is just as well accompanied by emission
of radiation, which, however, cuts down the primary radiation causing
the transition, and is therefore manifested as the decrease (i.e. absorption)
of the latter.

In the preceding discussion of the connexion between the quantized 
mechanical (electron) waves and the electromagnetic waves, the former 
were dealt with as the cause of the latter. This relation can, however, 
be reversed in the sense that the motion of the electrons is influenced 
by electromagnetic waves of external origin. This influence has been 
actually examined already by the method of the perturbation theory 
in the preceding section (in connexion with the scattering) and especially 
in § 49. It remains to be seen whether the two types of quantization, 
assumed for the two kinds of waves, are consistent with each other in 
this respect. 

The expressions obtained in § 50 by the perturbation theory for the
amplitudes $a_r$ which were supposed to have initially a characteristic
value zero, must obviously satisfy the general commutation relations
$a_r^{\dagger}a_s + a_s a_r^{\dagger} = \delta_{rs}$, etc. Assuming for the sake of simplicity that all the
coefficients $a_q^{0}$ but one vanish, we get, preserving the order of all the
non-trivial factors involved,

$$a_r^{\dagger}a_r = \frac{|P_{qr}|^2}{4h^2}\left[\frac{a_q^{0\dagger}\,b^{\dagger}b\,a_q^{0}}{(\nu_{rq}-\nu)^2} + \frac{a_q^{0\dagger}\,b\,b^{\dagger}a_q^{0}}{(\nu_{rq}+\nu)^2}\right]$$

+ a number of harmonically oscillating terms which we shall leave
aside, since their average value vanishes.

Now the products $b^{\dagger}b$ and $b\,b^{\dagger}$, whether $b$ and $b^{\dagger}$ are defined as the
amplitude-operators of the Bose-Einstein statistics or as the products
of the type $a_p^{\dagger}a_s$ and $a_s^{\dagger}a_p$ (with suitably chosen values of $p$ and $s$),
commute with $a_q^{0}$ and $a_q^{0\dagger}$. We thus get

$$a_r^{\dagger}a_r = a_q^{0\dagger}a_q^{0}\,\frac{|P_{qr}|^2}{4h^2}\left[\frac{b^{\dagger}b}{(\nu_{rq}-\nu)^2} + \frac{b\,b^{\dagger}}{(\nu_{rq}+\nu)^2}\right],$$

and in a similar way

$$a_r a_r^{\dagger} = a_q^{0}a_q^{0\dagger}\,\frac{|P_{qr}|^2}{4h^2}\left[\frac{b\,b^{\dagger}}{(\nu_{rq}-\nu)^2} + \frac{b^{\dagger}b}{(\nu_{rq}+\nu)^2}\right].$$

We see from these equations that the relation $a_r^{\dagger}a_r + a_r a_r^{\dagger} = 1$ will
follow from the relation $a_q^{0\dagger}a_q^{0} + a_q^{0}a_q^{0\dagger} = 1$ only if it is assumed that
$b\,b^{\dagger} = b^{\dagger}b$, that is, if $b$ and $b^{\dagger}$ are treated as ordinary (commutable)
numbers. As to the relations $a_r^{\dagger}a_s + a_s a_r^{\dagger} = 0$, etc., they are easily seen
to hold (if $r \neq s$); in fact, so long as oscillating terms are dropped, we
get separately $a_r^{\dagger}a_s = a_s a_r^{\dagger} = 0$.


52. The Quantum Electrodynamics of Heisenberg, Pauli, and
Dirac

The absence of complete harmony between the mechanical and the 
electromagnetic waves from the point of view of their quantization is a 
very unsatisfactory feature of the preceding theory. It can be shown, 
however, to be due, at least to some extent, to the approximate form in 
which this theory has been developed hitherto. We shall now briefly 
consider its more exact formulation due to Heisenberg and Pauli. This 
formulation is at the same time a generalization, which treats the 
radiation field as but a special case of the electromagnetic field, produced
by matter and acting upon it, and includes ordinary electric and
magnetic forces, treating them in the same way as radiation effects. 

The theory of Heisenberg and Pauli can be condensed into the 
following equations: 

1. The equation of motion

$$[\,p_t + e\phi + c\boldsymbol{\gamma}\cdot\mathbf{u} + \gamma_0 m_0c^2\,]\,\psi = 0, \tag{450}$$

where $\psi$ is Dirac’s one-column matrix with the four components
$\psi_1$, $\psi_2$, $\psi_3$, $\psi_4$.

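2. The equations of the electromagnetic field

$$\Bigl(\nabla^2 - \frac{1}{c^2}\frac{\partial^2}{\partial t^2}\Bigr)\phi = -4\pi e\,\psi^{\dagger}\psi, \qquad \Bigl(\nabla^2 - \frac{1}{c^2}\frac{\partial^2}{\partial t^2}\Bigr)\mathbf{A} = -4\pi e\,\psi^{\dagger}\boldsymbol{\gamma}\psi, \tag{451}$$

with the usual relations

$$\mathbf{E} = -\nabla\phi - \frac{1}{c}\frac{\partial\mathbf{A}}{\partial t}, \qquad \mathbf{H} = \operatorname{curl}\mathbf{A} \tag{451 a}$$

between the potentials $\phi$, $\mathbf{A}$ and the field strengths $\mathbf{E}$, $\mathbf{H}$.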

3. The commutability equations expressing the quantization of the
mechanical field according to the Pauli-Fermi statistics:

$$\psi(x)\psi^{\dagger}(x') + \psi^{\dagger}(x')\psi(x) = \delta(x-x'),$$

$$\psi(x)\psi(x') + \psi(x')\psi(x) = 0, \tag{452}$$

$$\psi^{\dagger}(x)\psi^{\dagger}(x') + \psi^{\dagger}(x')\psi^{\dagger}(x) = 0.$$

4. The commutability equations for the electromagnetic field in
empty space (i.e. in the absence of matter, see below):

$$E_k(x)A_l(x') - A_l(x')E_k(x) = \frac{hc}{2\pi i}\,\delta_{kl}\,\delta(x-x'), \tag{453}$$

$$A_k(x)A_l(x') - A_l(x')A_k(x) = 0, \qquad E_k(x)E_l(x') - E_l(x')E_k(x) = 0. \tag{453 a}$$

The equations (450) and (451) along with the quantum conditions
(452) can be considered as a generalization of the equations (410) and
(410 b) which have been established in § 46 as the exact equivalent of the
Schrödinger theory of a system of electrons described by unquantized
$\psi$-waves in the configuration space, and acting on each other according
to Coulomb’s law. This generalization consists in the introduction of
the finite velocity of propagation $c$ of electromagnetic actions, both in
an indirect way, by substituting the relativistic equation of motion
(450) for the non-relativistic one (410), and in a direct way, by substituting
the equations (451) expressing the law of the retarded action
for the Poisson equation (410 b).

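The differential equations (451) can be replaced by the explicit
expressions for the ‘retarded’ potentials

$$\phi(\mathbf{r},t) = e\int\frac{\psi^{\dagger}(\mathbf{r}',t')\,\psi(\mathbf{r}',t')}{|\mathbf{r}-\mathbf{r}'|}\,dV' + \phi^{0}(\mathbf{r},t), \qquad \mathbf{A}(\mathbf{r},t) = e\int\frac{\psi^{\dagger}(\mathbf{r}',t')\,\boldsymbol{\gamma}\,\psi(\mathbf{r}',t')}{|\mathbf{r}-\mathbf{r}'|}\,dV' + \mathbf{A}^{0}(\mathbf{r},t), \tag{454}$$

where $t' = t - |\mathbf{r}-\mathbf{r}'|/c$; $\phi^{0}$ and $\mathbf{A}^{0}$ are arbitrary solutions of the homogeneous
d’Alembert equations

$$\Bigl(\nabla^2 - \frac{1}{c^2}\frac{\partial^2}{\partial t^2}\Bigr)\phi^{0} = 0, \qquad \Bigl(\nabla^2 - \frac{1}{c^2}\frac{\partial^2}{\partial t^2}\Bigr)\mathbf{A}^{0} = 0, \tag{454 a}$$

satisfying the relation

$$\operatorname{div}\mathbf{A}^{0} + \frac{1}{c}\frac{\partial\phi^{0}}{\partial t} = 0. \tag{454 b}$$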

If we put $\phi^{0} = 0$, $\mathbf{A}^{0} = 0$, that is, confine ourselves to the retarded
potentials produced by the motion of the electrons which is described
by the operator-function $\psi$, the action of an electron on itself which
may seem to follow from these equations is actually eliminated automatically
owing to the commutation relations (452). The equations
(452), (450), and (454) (with $\phi^{0} = 0$, $\mathbf{A}^{0} = 0$) must thus give the adequate
description of the mutual action of the electrons allowing for the relativity
and retardation effects.

The weak point of the Heisenberg-Pauli theory consists, as it seems,
in the introduction of additional quantization rules for the electromagnetic
field expressed by the equations (453). These equations do
not follow from the equations (451) in conjunction with (452), but are
postulated on the basis of the analogy between the light waves and the
mechanical waves which describe the motion of particles conforming to
the Einstein-Bose statistics. In order to obtain the commutability
relations for the electromagnetic field, Heisenberg and Pauli (following
an earlier paper by Pauli and Jordan) actually come back to the old
mechanical theory of light, considered as vibrations of an elastic ether,
and give the quantum-mechanical theory of these vibrations, based on
the classical wave equations (454 a). It is indeed possible to write down
the latter in a form corresponding to the ordinary Hamiltonian equations
of motion of a system of material points for the limiting case when
these points constitute a continuous medium. Replacing the classical
Hamiltonian equations of the motion of such a continuous medium by
the corresponding matrix or wave-mechanical equations, one obtains
the equations for the quantized elastic or electromagnetic waves. The
photons corresponding to these waves are thus introduced in exactly
the same way as the phonons, corresponding to ordinary sound waves
(Part I). The energy of electromagnetic (or ‘elastic’) oscillations of a
given frequency $\nu$ is thus quantized according to the usual formula
$(n+\tfrac{1}{2})h\nu$ for the ordinary harmonic oscillator. In order to get rid of the
$\tfrac{1}{2}$ it is necessary to modify the definition of the energy in the way
shown in § 49 and § 50.

It should be remembered that the above theory refers to the ‘free
ether’, i.e. to empty space, without electric charges. This corresponds
to the electromagnetic field which has been denoted above as $\phi^{0}$, $\mathbf{A}^{0}$.
Now such a field can be described, as is well known, without loss of
generality by putting $\phi^{0} = 0$. Treating the components of the vector $\mathbf{A}^{0}$
as the coordinates of the particles of an elastic ether described by the
Lagrangian function $L = \frac{1}{8\pi}\int(E^2-H^2)\,dV$, one can define the electric
field $\mathbf{E}^{0} = -\dfrac{1}{c}\dfrac{\partial\mathbf{A}^{0}}{\partial t}$ as the quantity corresponding to the mechanical
momentum of these particles. Hence we obtain the commutation
relations (453), (453 a) which are merely the ordinary commutation
relations

$$p_{kn}q_{ln'} - q_{ln'}p_{kn} = \frac{h}{2\pi i}\,\delta_{kl}\,\delta_{nn'} \qquad (k, l = 1, 2, 3),$$

etc., for a system of particles $1, 2,\ldots, n,\ldots, n',\ldots$ in the limiting case when
these particles form a continuum.

It should be mentioned that this field can be represented as a superposition
of plane harmonic waves, as has already been done in § 49. The
commutation relations (453), (453 a) can be replaced accordingly by the
relations

$$A_m^\dagger(\mathbf{k})\,A_n(\mathbf{k}') - A_n(\mathbf{k}')\,A_m^\dagger(\mathbf{k}) = -\frac{ch}{\ldots}\,\delta_{mn}\,\delta(\mathbf{k}-\mathbf{k}'), \tag{455}$$

to which we must add the relation

$$\phi^\dagger(\mathbf{k})\,\phi(\mathbf{k}') - \phi(\mathbf{k}')\,\phi^\dagger(\mathbf{k}) = +\frac{ch}{\ldots}\,\delta(\mathbf{k}-\mathbf{k}'), \tag{455 a}$$

all other combinations being mutually commutable. These relations
can be derived directly from the relations of § 49 for the operators
$b^\dagger$, $b$ representing the amplitudes of the harmonic terms with positive
and negative frequencies respectively for the limiting case of an enclosure
with an infinite volume.

In order to preserve the above commutation relations for the electromagnetic
field in the presence of electric charges (electrons) it is necessary
to modify Maxwell’s equations by the addition of small terms
proportional to the expression $P_4 = \operatorname{div}\mathbf{A} + \frac{1}{c}\frac{\partial\phi}{\partial t}$ or to its derivatives,
replacing the condition (454 b) by the additional commutation relation

$$\epsilon\,[P_4(x)\,\phi(x') - \phi(x')\,P_4(x)] = \frac{hc}{\ldots}\,\delta(\mathbf{r}-\mathbf{r}'), \tag{455 b}$$

where $\epsilon$ is the above-mentioned proportionality coefficient which in the
final result is set equal to zero.

It has been recently shown by Dirac that it is possible to give a 
somewhat different (relativistically invariant) formulation of the 
Heisenberg-Pauli theory for a system consisting of a given number of 
electrons or indeed of electrified particles of any kind. In Dirac’s theory 
the particles are described by the method of the configuration space, 
and their mutual action is defined implicitly through their coupling
with the quantized electromagnetic field in empty space in conjunction 
with a certain restrictive condition imposed on the wave function. 

Let $\psi(x_1, t_1; x_2, t_2; \ldots; x_N, t_N)$ be the wave function of the particles (electrons),
each considered with its own individual time, and let further
$\phi(x,t)$, $\mathbf{A}(x,t)$ be the potentials of the quantized electromagnetic field,
satisfying the equations (454a), (454 b) and the commutation conditions
(455). Dirac’s equations can then be written as follows:

$$\Bigl(H_k + \frac{h}{2\pi i}\frac{\partial}{\partial t_k}\Bigr)\psi = 0, \tag{456}$$

where

$$H_k = e_k\,\phi(x_k, t_k) + c\,\boldsymbol{\gamma}_k\cdot\Bigl[\mathbf{p}_k - \frac{e_k}{c}\,\mathbf{A}(x_k, t_k)\Bigr] + \gamma_{k0}\,m_{k0}\,c^2 \tag{456a}$$

is the Hamiltonian for the $k$th particle.

The function $\psi$ must be actually treated as a matrix with respect to
the stationary states of the field taken alone. These states correspond 
to the different plane harmonic waves specified by the wave-number 
vector k and the polarization quantum number f. Associating these 
with photons, we can regard the above treatment as a particular case 
of the general method of treating incomplete systems, explained in 
Chap. VII, § 39, the ignored part (B) of the complete system being the 
‘photon gas’. 

It could be argued that it must be possible in this way to give an
adequate description of the mutual action between the particles, inasmuch
as their mutual action with the photons [ignored in the equations
(454 a)] is represented by the energy operators

$$M_k = e_k\,[\phi(x_k, t_k) - \boldsymbol{\gamma}_k\cdot\mathbf{A}(x_k, t_k)] \tag{456 b}$$

(the operator $c\,\boldsymbol{\gamma}_k\cdot\mathbf{p}_k + \gamma_{k0}\,m_{k0}\,c^2$ corresponding to the energy of the $k$th
particle taken alone).

This is, however, not so, for the relation between matter and field is
expressed not only by this operator $M$, describing the effect of the
latter on the former, but also by the terms $e\psi^\dagger\psi$ and $e\psi^\dagger\boldsymbol{\gamma}\psi$ on the right
side of the equations (451) which describe the effect of the matter on the
field. It is obviously impossible to get rid of this side of their mutual
relationship, and it must be introduced somehow, explicitly or implicitly,
into the preceding theory in order to transform it into a theory not only
of the motion but also of the mutual action between all the particles
concerned. This is done by Dirac in the following manner:

Let us come back to the complete system: electrons + photons
(electromagnetic field), and let us consider $\psi$ as a function both of the
$x_k$, $t_k$ of the former, and of the $x$, $t$ of the latter, it being understood
that the system is doubly quantized with respect to the photons [which
corresponds to the commutation relations (455)]. The equations (454a),
(454 b) will then be rewritten in the form

$$\Bigl(\nabla^2\phi - \frac{1}{c^2}\frac{\partial^2\phi}{\partial t^2}\Bigr)\psi = 0, \qquad \Bigl(\nabla^2\mathbf{A} - \frac{1}{c^2}\frac{\partial^2\mathbf{A}}{\partial t^2}\Bigr)\psi = 0, \tag{457}$$

$$\Bigl(\operatorname{div}\mathbf{A} + \frac{1}{c}\frac{\partial\phi}{\partial t}\Bigr)\psi = 0, \tag{457 a}$$


$\phi$ and $\mathbf{A}$ being defined as certain operators acting on $\psi$. The latter equation
can be considered as a constraint to which the function $\psi$ is subject.
Now in order to describe the influence of matter on the electromagnetic
field this equation must be replaced by the following generalized
equation:

$$\Bigl[\operatorname{div}\mathbf{A} + \frac{1}{c}\frac{\partial\phi}{\partial t}\Bigr]\psi = \sum_{s=1}^{N} e_s\,\Delta(X - X_s)\,\psi. \tag{458}$$

Here $X_s = (x_s, y_s, z_s, t_s)$ and $\Delta(X)$ is the so-called ‘invariant delta-function’
(introduced by Jordan and Pauli)

$$\Delta(X) = \frac{1}{r}\,[\delta(r + ct) - \delta(r - ct)] \tag{458a}$$

(it represents a spherical wave concentrated in an infinitely thin layer
and travelling with the velocity of light from infinity so as to converge
at the point $r = 0$ at $t = 0$ and then diverging again to infinity). Using
the relations $\mathbf{E} = -\nabla\phi - \frac{1}{c}\frac{\partial\mathbf{A}}{\partial t}$, $\mathbf{H} = \operatorname{curl}\mathbf{A}$, one obtains accordingly,
besides the equations

$$\operatorname{curl}\mathbf{E} + \frac{1}{c}\frac{\partial\mathbf{H}}{\partial t} = 0, \qquad \operatorname{div}\mathbf{H} = 0,$$

which can be considered as identities, the equation

$$(\operatorname{div}\mathbf{E})\,\psi = -\frac{1}{c}\Bigl[\frac{\partial}{\partial t}\sum_{s=1}^{N} e_s\,\Delta(X - X_s)\Bigr]\psi. \tag{458 b}$$


Let us now put $t_1 = t_2 = \ldots = t_N = t = T$, i.e. introduce a common
time for all the particles and for the field, and denote the corresponding
complete derivative for any quantity $f$ by $df/dT$, so that

$$\frac{df}{dT} = \frac{\partial f}{\partial t} + \sum_{s=1}^{N}\frac{\partial f}{\partial t_s}.$$
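Then remembering the relations

$$\frac{\partial\mathbf{A}}{\partial t} = \frac{d\mathbf{A}}{dT}, \qquad \frac{\partial\phi}{\partial t} = \frac{d\phi}{dT}$$

and with the help of the formula

$$\frac{1}{c}\Bigl[\frac{\partial}{\partial t}\Delta(X)\Bigr]_{t=0} = -4\pi\,\delta(\mathbf{r}),$$

we easily get, along with the trivial expressions

$$\mathbf{E} = -\nabla\phi - \frac{1}{c}\frac{d\mathbf{A}}{dT}, \qquad \mathbf{H} = \operatorname{curl}\mathbf{A},$$

the equations

$$\Bigl(\operatorname{div}\mathbf{A} + \frac{1}{c}\frac{d\phi}{dT}\Bigr)\psi = 0, \tag{459}$$

$$\Bigl(\operatorname{curl}\mathbf{H} - \frac{1}{c}\frac{d\mathbf{E}}{dT}\Bigr)\psi = 4\pi\Bigl[\sum_{s=1}^{N} e_s\,\boldsymbol{\gamma}_s\,\delta(\mathbf{r} - \mathbf{r}_s)\Bigr]\psi, \tag{459 a}$$

and

$$(\operatorname{div}\mathbf{E})\,\psi = 4\pi\Bigl[\sum_{s=1}^{N} e_s\,\delta(\mathbf{r} - \mathbf{r}_s)\Bigr]\psi, \tag{459 b}$$

which are equivalent to

$$\Bigl(\nabla^2\phi - \frac{1}{c^2}\frac{d^2\phi}{dT^2}\Bigr)\psi = -4\pi\Bigl[\sum_{s=1}^{N} e_s\,\delta(\mathbf{r} - \mathbf{r}_s)\Bigr]\psi$$

and

$$\Bigl(\nabla^2\mathbf{A} - \frac{1}{c^2}\frac{d^2\mathbf{A}}{dT^2}\Bigr)\psi = -4\pi\Bigl[\sum_{s=1}^{N} e_s\,\boldsymbol{\gamma}_s\,\delta(\mathbf{r} - \mathbf{r}_s)\Bigr]\psi.$$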
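
The formula $\frac{1}{c}[\partial\Delta(X)/\partial t]_{t=0} = -4\pi\,\delta(\mathbf{r})$ used above can be checked numerically. The following is only a sketch under stated assumptions and forms no part of the original argument: the delta-function in $\Delta(X) = \frac{1}{r}[\delta(r+ct) - \delta(r-ct)]$ is replaced by a narrow Gaussian of width `sigma`, the time derivative is taken by a central difference of step `eps`, and all function names are illustrative. The volume integral of the left-hand side should then come out close to $-4\pi$.

```python
import math


def gaussian_delta(u, sigma):
    """Assumed regularization: a normalized Gaussian standing in for delta(u)."""
    return math.exp(-0.5 * (u / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def invariant_delta(r, t, c, sigma):
    """Regularized Delta(X) = [delta(r + ct) - delta(r - ct)] / r."""
    return (gaussian_delta(r + c * t, sigma) - gaussian_delta(r - c * t, sigma)) / r


def integrated_time_derivative(c=1.0, sigma=0.05, eps=1e-4, r_max=1.0, steps=20000):
    """Volume integral of (1/c) dDelta/dt at t = 0 (midpoint rule in r)."""
    h = r_max / steps
    total = 0.0
    for i in range(steps):
        r = (i + 0.5) * h  # midpoints avoid the point r = 0
        d_dt = (invariant_delta(r, eps, c, sigma)
                - invariant_delta(r, -eps, c, sigma)) / (2.0 * eps)
        total += 4.0 * math.pi * r * r * (d_dt / c) * h
    return total


if __name__ == "__main__":
    print(integrated_time_derivative(), -4.0 * math.pi)
```

The result does not depend on the assumed width so long as `sigma` is small compared with `r_max` and large compared with `c * eps`, which is what one expects of a quantity proportional to $\delta(\mathbf{r})$.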

In the limit $c = \infty$ these equations, together with the equations of
motion (456), reduce to the ordinary Schrödinger equation for the
$N$ particles in the configuration space with the mutual potential energy

$$U = \sum_{s<s'}\frac{e_s e_{s'}}{r_{ss'}}$$

corresponding to the Coulomb forces.

53. Breit’s Formula. Concluding Remarks 

The theories of Heisenberg and Pauli and of Dirac have been 
hitherto in practice rather fruitless, that is, they have not led to any 
marked progress in the theory of the interaction of electrons. The only 
improvement over the simple interaction theory based on Coulomb’s 
law is represented by a formula originally derived by Breit from the 
general equations of Heisenberg and Pauli’s theory. Breit’s results 
amount to the following approximate expression for the mutual energy 
of two electrons 


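$$W = \frac{e^2}{r} - \frac{e^2}{2}\Bigl[\frac{\boldsymbol{\gamma}^{I}\cdot\boldsymbol{\gamma}^{II}}{r} + \frac{(\boldsymbol{\gamma}^{I}\cdot\mathbf{r})(\boldsymbol{\gamma}^{II}\cdot\mathbf{r})}{r^3}\Bigr], \tag{460}$$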

where $c\boldsymbol{\gamma}^{I}$ and $c\boldsymbol{\gamma}^{II}$ are the respective velocity matrices of Dirac’s theory.
This expression takes account of the electromagnetic (spin-orbit) and
magnetic (spin-spin) interaction and also to some extent of the retardation
effects. It can be derived in a much simpler way without any use
of the Heisenberg-Pauli-Dirac electrodynamics. The simplest and most
straightforward of these derivations is the following one due to K.
Nikolsky.†

The energy of an electron in an external electromagnetic field specified
by the potentials $\phi$, $\mathbf{A}$ is given by the formula

$$W = e\phi - e\,\boldsymbol{\gamma}\cdot\mathbf{A}. \tag{461}$$


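Let us imagine that these potentials are due to the retarded action of
a second electron moving classically. Their values at a given point and
instant $\tau$ can be expanded in Lagrange’s series in the form‡

$$\phi = e\sum_{n=0}^{\infty}\frac{(-1)^n}{n!}\,\frac{1}{c^n}\,\frac{d^n}{d\tau^n}\bigl(r^{n-1}\bigr), \qquad \mathbf{A} = e\sum_{n=0}^{\infty}\frac{(-1)^n}{n!}\,\frac{1}{c^n}\,\frac{d^n}{d\tau^n}\Bigl(r^{n-1}\,\frac{\mathbf{v}}{c}\Bigr), \tag{462}$$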


where $\mathbf{v}$ is the velocity of the electron producing the field and $r$ its
distance from the other electron at the instant $\tau$. It is natural to think
that the quantum theory of the interaction can be obtained from the
classical one by replacing the classical time derivatives $d\phi/d\tau$ by the
quantum Poisson bracket expression $[H, \phi] = (2\pi i/h)(H\phi - \phi H)$, where
$H$ is the Hamiltonian of the system formed by the two electrons without
the interaction term $W$. The velocity vector $\mathbf{v}$ must naturally also be
replaced by the matrix vector $c\boldsymbol{\gamma}$. We thus get

$$\frac{d^n\phi}{d\tau^n} = \Bigl(\frac{2\pi i}{h}\Bigr)^{n}\sum_{\nu=0}^{n}(-1)^{n-\nu}\,C_n^{\nu}\,H^{\nu}\,\phi\,H^{n-\nu}, \tag{463}$$

where

$$C_n^{\nu} = \frac{n!}{\nu!\,(n-\nu)!}.$$
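
The expansion (463) rests on the identity that the $n$-fold repeated commutator of $H$ with $\phi$ equals $\sum_{\nu}(-1)^{n-\nu}C_n^{\nu}H^{\nu}\phi H^{n-\nu}$. A minimal sketch of a check follows, assuming small integer matrices in place of the operators and omitting the common factor $(2\pi i/h)^n$; the function names are illustrative and do not come from the text.

```python
from math import comb


def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


def mat_pow(a, power):
    """Non-negative integer power; power 0 gives the unit matrix."""
    size = len(a)
    result = [[int(i == j) for j in range(size)] for i in range(size)]
    for _ in range(power):
        result = mat_mul(result, a)
    return result


def nested_commutator(h, x, n):
    """Apply x -> h x - x h exactly n times."""
    for _ in range(n):
        hx, xh = mat_mul(h, x), mat_mul(x, h)
        x = [[p - q for p, q in zip(row_hx, row_xh)]
             for row_hx, row_xh in zip(hx, xh)]
    return x


def binomial_sum(h, x, n):
    """Sum over nu of (-1)^(n-nu) C(n, nu) h^nu x h^(n-nu)."""
    size = len(h)
    total = [[0] * size for _ in range(size)]
    for nu in range(n + 1):
        coeff = (-1) ** (n - nu) * comb(n, nu)
        term = mat_mul(mat_mul(mat_pow(h, nu), x), mat_pow(h, n - nu))
        total = [[t + coeff * u for t, u in zip(row_t, row_u)]
                 for row_t, row_u in zip(total, term)]
    return total


if __name__ == "__main__":
    H = [[1, 2, 0], [0, -1, 3], [4, 0, 2]]
    X = [[0, 1, 1], [2, 0, -1], [1, 3, 0]]
    print(all(nested_commutator(H, X, n) == binomial_sum(H, X, n) for n in range(6)))
```

The identity holds because multiplication by $H$ from the left and from the right are commuting operations, so that the binomial theorem applies to their difference.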

Here $\tau$ corresponds to the common time $T$ of the whole system, that
is, of the two electrons (the electromagnetic field being no longer considered
as a dynamical system and playing an auxiliary role only). It
is natural to define the corresponding energy $H$ as the sum

$$H = H^{I} + H^{II}, \tag{464}$$

where $H^{I}$ and $H^{II}$ are the Hamiltonians of the two electrons taken
separately, i.e.

$$H^{I} = c\,\boldsymbol{\gamma}^{I}\cdot\mathbf{p}^{I} + \gamma_0^{I}\,m_0c^2, \qquad H^{II} = c\,\boldsymbol{\gamma}^{II}\cdot\mathbf{p}^{II} + \gamma_0^{II}\,m_0c^2. \tag{464a}$$


† Not yet published. The other derivations are due to Møller, Rosenfeld, and Scherzer.
‡ Cf. my Lehrbuch der Elektrodynamik, i, p. 184.
It must not be supposed that this expression for $H$ omits completely
the mutual action of the electrons. In fact the operators $\mathbf{p}^{I}$ and $\mathbf{p}^{II}$
must be considered as representing the total momentum of the respective
electrons, including the ‘potential momentum’ due to its partner, i.e.

$$\mathbf{p}^{I} = \frac{m_0\mathbf{v}^{I}}{\sqrt{1-(v^{I}/c)^2}} + \frac{e^2}{c^2r}\,\mathbf{v}^{II}, \qquad \mathbf{p}^{II} = \frac{m_0\mathbf{v}^{II}}{\sqrt{1-(v^{II}/c)^2}} + \frac{e^2}{c^2r}\,\mathbf{v}^{I}, \tag{464 b}$$

in agreement with the approximate theory of § 38. This will become
more apparent when we compare Breit’s formula with the result of the
above theory. With the help of (461), (462), and (463) we get:


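$$W = W^{I,II} = e^2\sum_{n=0}^{\infty}\sum_{\nu=0}^{n}\frac{(-1)^{\nu}}{\nu!\,(n-\nu)!}\Bigl(\frac{2\pi i}{hc}\Bigr)^{n}\bigl[H^{\nu}r^{n-1}H^{n-\nu} - \boldsymbol{\gamma}^{I}\cdot H^{\nu}\boldsymbol{\gamma}^{II}r^{n-1}H^{n-\nu}\bigr], \tag{465}$$

where

$$H^{\nu} = \sum_{\lambda=0}^{\nu} C_{\nu}^{\lambda}\,(H^{I})^{\lambda}\,(H^{II})^{\nu-\lambda}. \tag{465 a}$$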


Dropping terms of the third and higher orders with respect to $v/c$
(i.e. $\gamma$), we have

$$W = e^2\Bigl(\frac{1}{r} - \frac{\boldsymbol{\gamma}^{I}\cdot\boldsymbol{\gamma}^{II}}{r}\Bigr) - \frac{2\pi^2e^2}{h^2c^2}\,(rH^2 - 2HrH + H^2r),$$

whence, according to (465 a),

$$W^{I,II} = \ldots\,\bigl[rH^{I}H^{II} - H^{II}rH^{I} - H^{I}rH^{II} + H^{I}H^{II}r\bigr]$$

together with terms which are proportional to the square of $H^{I}$ or $H^{II}$,
which we shall neglect as having no physical meaning (they represent
the action of a point-like electron on itself).†

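Now the expression in the brackets [ ] can be put in the form

$$H^{II}(H^{I}r - rH^{I}) - (H^{I}r - rH^{I})H^{II}.$$

Using the formulae (464 a) for $H^{I}$ and $H^{II}$ we get

$$H^{I}r - rH^{I} = \frac{h}{2\pi i}\frac{dr}{dt^{I}} = \frac{h}{2\pi i}\,\frac{c\,\boldsymbol{\gamma}^{I}\cdot\mathbf{r}}{r}$$

and

$$H^{II}(H^{I}r - rH^{I}) - (H^{I}r - rH^{I})H^{II} = \ldots\,\Bigl[\frac{\boldsymbol{\gamma}^{I}\cdot\boldsymbol{\gamma}^{II}}{r} - \frac{(\boldsymbol{\gamma}^{I}\cdot\mathbf{r})(\boldsymbol{\gamma}^{II}\cdot\mathbf{r})}{r^3}\Bigr],$$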


which leads to Breit’s formula for $W^{I,II}$, this expression being actually
symmetrical with regard to the two electrons.

† These terms are physically irrelevant also for another reason, namely, because the
squares of the matrices $\boldsymbol{\gamma}^{I}$ and $\boldsymbol{\gamma}^{II}$ are equal to 1 (or rather to 3), whereas they must
represent small quantities of the second order with respect to $v^{I}/c$ and $v^{II}/c$. This difficulty
has, however, an origin entirely different from the preceding one, being connected
with the existence of states of negative proper energy.

The classical expression for $W$ corresponding to Breit’s formula is

$$W = \frac{e^2}{r} - \frac{e^2}{2c^2}\Bigl[\frac{1}{r}\,\mathbf{v}^{I}\cdot\mathbf{v}^{II} + \frac{1}{r^3}\,(\mathbf{r}\cdot\mathbf{v}^{I})(\mathbf{r}\cdot\mathbf{v}^{II})\Bigr]. \tag{466}$$

The second term must obviously represent the effect of the electromagnetic
interaction between the two electrons with due account of the
retardation. Now in the non-relativistic theory of § 38 where this
retardation was left out of account, the electromagnetic interaction was
shown to correspond to a mutual kinetic energy

$$T = \frac{e^2\,\mathbf{v}^{I}\cdot\mathbf{v}^{II}}{c^2r}, \tag{466 a}$$

which is quite different from the second term of (466).

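This difference is, however, greatly attenuated if we consider the
total energy of the two electrons $H + W$, or more exactly
the classical expression which corresponds to it and which is obtained
if $c\boldsymbol{\gamma}$ is replaced by $\mathbf{v}$, $\gamma_0$ by $\sqrt{1-(v/c)^2}$, and the $\mathbf{p}$’s by the expressions
(464 b). We thus get

$$H = \mathbf{v}^{I}\cdot\Bigl\{\frac{m_0\mathbf{v}^{I}}{\sqrt{1-(v^{I}/c)^2}} + \frac{e^2}{c^2r}\,\mathbf{v}^{II}\Bigr\} + m_0c^2\sqrt{1-(v^{I}/c)^2} + \mathbf{v}^{II}\cdot\Bigl\{\frac{m_0\mathbf{v}^{II}}{\sqrt{1-(v^{II}/c)^2}} + \frac{e^2}{c^2r}\,\mathbf{v}^{I}\Bigr\} + m_0c^2\sqrt{1-(v^{II}/c)^2}$$

$$\phantom{H} = \frac{m_0c^2}{\sqrt{1-(v^{I}/c)^2}} + \frac{m_0c^2}{\sqrt{1-(v^{II}/c)^2}} + \frac{2e^2}{c^2r}\,\mathbf{v}^{I}\cdot\mathbf{v}^{II},$$

and consequently

$$H + W = \frac{m_0c^2}{\sqrt{1-(v^{I}/c)^2}} + \frac{m_0c^2}{\sqrt{1-(v^{II}/c)^2}} + \frac{e^2}{r} + \frac{e^2}{2c^2}\Bigl[\frac{3}{r}\,\mathbf{v}^{I}\cdot\mathbf{v}^{II} - \frac{1}{r^3}\,(\mathbf{r}\cdot\mathbf{v}^{I})(\mathbf{r}\cdot\mathbf{v}^{II})\Bigr]. \tag{466 b}$$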


The first three terms in this expression represent the proper energy of
the two electrons and their mutual potential energy, whereas the last
one gives the energy of the electromagnetic interaction. Although still
somewhat different from (466 a), it is, however, much more similar to it
than the corresponding term of $W$. We obtain a still closer similarity
if we average over all the directions of the vector $\mathbf{r}$, considering them
as equally probable. We thus get

$$\overline{(\mathbf{r}\cdot\mathbf{v}^{I})(\mathbf{r}\cdot\mathbf{v}^{II})} = \tfrac{1}{3}\,\mathbf{v}^{I}\cdot\mathbf{v}^{II}\,r^2,$$

which gives

$$H + W = \frac{m_0c^2}{\sqrt{1-(v^{I}/c)^2}} + \frac{m_0c^2}{\sqrt{1-(v^{II}/c)^2}} + \frac{e^2}{r} + \frac{4}{3}\,\frac{e^2}{c^2r}\,\mathbf{v}^{I}\cdot\mathbf{v}^{II}. \tag{466 c}$$

The factor $\tfrac{4}{3}$ appearing in the last (electromagnetic) term is the same
as that which is met with in the calculation of the electron’s mass as
due to the electromagnetic mutual action of the elements of its charge
(supposed to be distributed in a spherically symmetrical way in a finite
volume).
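
The averaging step leading from (466 b) to (466 c) can be illustrated numerically. The sketch below is an illustration only, under the assumption that the average over directions is approximated by a midpoint rule on a grid uniform in $\cos\theta$ and in the azimuth; the names `angular_average` and `electromagnetic_factor` are hypothetical. It checks that the mean of $(\mathbf{n}\cdot\mathbf{v}^{I})(\mathbf{n}\cdot\mathbf{v}^{II})$ over unit vectors $\mathbf{n}$ is $\tfrac{1}{3}\mathbf{v}^{I}\cdot\mathbf{v}^{II}$, and that the bracket of (466 b) then yields the coefficient $\tfrac{1}{2}(3 - \tfrac{1}{3}) = \tfrac{4}{3}$.

```python
import math
from fractions import Fraction


def angular_average(v1, v2, n_theta=200, n_phi=400):
    """Mean of (n.v1)(n.v2) over all directions of the unit vector n."""
    total = 0.0
    for i in range(n_theta):
        cos_t = -1.0 + (i + 0.5) * 2.0 / n_theta  # uniform in cos(theta)
        sin_t = math.sqrt(1.0 - cos_t * cos_t)
        for j in range(n_phi):
            phi = (j + 0.5) * 2.0 * math.pi / n_phi
            n = (sin_t * math.cos(phi), sin_t * math.sin(phi), cos_t)
            dot1 = sum(a * b for a, b in zip(n, v1))
            dot2 = sum(a * b for a, b in zip(n, v2))
            total += dot1 * dot2
    return total / (n_theta * n_phi)


def electromagnetic_factor():
    """Coefficient of e^2 (v1.v2) / (c^2 r) after averaging the bracket of (466 b)."""
    return Fraction(1, 2) * (3 - Fraction(1, 3))


if __name__ == "__main__":
    v1, v2 = (1.0, 2.0, -0.5), (0.3, -1.0, 2.0)
    exact = sum(a * b for a, b in zip(v1, v2)) / 3.0
    print(angular_average(v1, v2), exact, electromagnetic_factor())
```

The two test vectors are arbitrary; any other pair should serve equally well, since the average depends on them only through their scalar product.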

The above derivation of Breit’s formula is not free from objection, 
especially with regard to the definition (464) of the energy $H$. It could
be slightly modified by adding $W$ to the expression used before (this
would not alter the results to the approximation considered). The
important point is that any symmetrization of the expression for $H$
leads to cancelling terms of odd degree in the products of $\boldsymbol{\gamma}^{I}$ and $\boldsymbol{\gamma}^{II}$. The
same result is obtained if in the derivation of Lagrange’s series for the
potentials $\phi$ and $\mathbf{A}$ we replace the retarded potentials by the mean
value of the retarded and the accelerated ones. This symmetrization with 
respect to the time (which has been actually used for a similar purpose 
by Fokker) is equivalent to the symmetrization of the energy H with 
respect to the two electrons. This is natural since the time and the 
energy are dynamically conjugate quantities. 

We thus see incidentally that so long as we are using a symmetrical 
energy operator for two electrons, it is impossible to describe that part 
of their mutual action which is antisymmetrical in the two particles or 
in the time and which corresponds to the dissipation of energy by 
radiation. 


This reproach may not be applicable to the accurate form of the 
Heisenberg-Pauli-Dirac theory. This theory cannot be considered, 
however, as a satisfactory system of quantum electrodynamics for 
many other reasons. In the first place it is based on a fundamentally 
wrong interpretation of the relationship between matter (electrons)
and electromagnetic field (photons) as a formal analogy, the quantum
theory of the electromagnetic field being developed accordingly as a 
wave-mechanical theory of the ‘ether’ in a somewhat disguised form 
adjusted to Maxwell’s equations. 

A second, more important, reason lies in the fact that material particles
are visualized as the primary things in Nature and are dealt with
as unextended points with dynamical properties independent of those of 
the electromagnetic field, while the electromagnetic field is treated as 
but an auxiliary agent introduced for the description of their mutual 
action and serving to determine their motion. It seems, however, more
reasonable to think that the electromagnetic field is the primary and 
fundamental thing in Nature, the material particles (electrons and protons)
being derivable from it, and possessing no independent mechanical
properties. This point of view corresponds to the latest development of 
the classical electrodynamics, culminating in the electromagnetic theory 
of mass. The mechanical momentum and energy—potential and kinetic 
—must be interpreted from this point of view as the approximate form 
of electromagnetic momentum and energy, directly connected not with 
the particles but with the electromagnetic field. The laws of motion 
can be derived accordingly from the principle of conservation of electromagnetic
momentum and energy, applied to separate electrons, if the
latter are considered not as points but as extended bodies (spheres) and 
if the external force acting on them is supposed to be balanced by the 
‘inner’ force, due to their own motion. 

This classical theory, which means the complete reduction of mechanics
to electrodynamics, has met with one serious difficulty, connected with
the problem of the spatial extension or ‘structure’ of the electron. It
is responsible for the fact that the electromagnetic theory of mass, or,
in other words, the electromagnetic derivation of mechanics, has remained
without further development until now. The advent of the
quantum theory did not in the least alter the situation, the modern 
wave or quantum mechanics being simply a modified form of the old 
mechanics of a point-like particle with a given mass. 

Now it seems quite certain that this new theory is in principle just 
as wrong as the old one, and that the next task in the development of
our theory of the physical universe will consist in the application of the
quantum ideas to the electromagnetic field in such a way as to obtain
the mechanical laws as a corollary from the laws of conservation of
electromagnetic energy and momentum. It is to be hoped that the
main difficulty of the classical theory connected with the problem of
the electron’s spatial extension will be eliminated by considering the
electron as the product (and not the source) of the electromagnetic field,
described in a consistent quantum way. One might, for example, define
the electromagnetic field as a matrix from the point of view of the
space-time manifold, i.e. as a matrix with the elements $(x'|F|x'')$, where
$x$ is an abbreviation for $x, y, z, t$, the diagonal elements representing the
probable values of the field at different points $x' = x''$. The electron
could be described accordingly with the help of a function $D(|x'-x''|/a)$,
similar to a Gaussian function, with a finite parameter $a$ playing the
role of the electron’s radius, $|x'-x''|$ being the four-dimensional distance
between the points $x'$ and $x''$. We are thus entitled to think that Dirac’s
equation of motion will be replaced by an equation containing the 
electromagnetic momentum-energy tensor; the mass of the electron, 
instead of being introduced a priori as a parameter, being derivable 
from the quantum equivalent for its radius. A closer discussion of this 
question is, however, hardly possible at the present time. 